unroll_and_jam pragma ignored but no reason specified

April 20, 2017, 11:00 pm

Latest and popular articles on Intel Technologies

Hi,

Consider the following C++ code:

#include <malloc.h>
#include <cmath>
#include <complex>

int main(int argc, char **argv) {
    int N = 4000000;
    double * _arr_4_0;
  _arr_4_0 = (double *) (malloc((sizeof(double) * (unsigned long) (5331.0))));
  for (int _i0 = 0; (_i0 <= 5330); _i0 = (_i0 + 1))
  {
    _arr_4_0[_i0] = std::sin(_i0);
  }
    double * _arr_7_7;
  _arr_7_7 = (double *) (malloc((sizeof(double) * (unsigned long) (((0.1 * (double) (N)) + -66.0)))));
  #pragma omp parallel for schedule(static)
  #pragma ivdep
  for (int _i0 = 0; (_i0 < ((N / 10) - 66)); _i0 = (_i0 + 1))
  {
    _arr_7_7[_i0] = std::sqrt(_i0);
  }
    std::complex<double> * _arr_6_8;
  _arr_6_8 = (std::complex<double> *) (malloc((sizeof(std::complex<double>) * (unsigned long) (((0.1 * (double) (N)) + -5396.0)))));
  for (int o1 = 0; (o1 < (((N + 110) / 320) - 168)); o1 = (o1 + 1))
  {
    int _ct167 = ((((32 * o1) + 31) < ((N / 10) - 5397))? ((32 * o1) + 31): ((N / 10) - 5397));
    for (int o2 = (32 * o1); (o2 <= _ct167); o2 = (o2 + 1))
    {
      _arr_6_8[o2] = (0.0 + 0.0j);
    }
  }
  #pragma omp parallel for schedule(static)
  for (int o1 = 0; (o1 < (((N + 110) / 320) - 168)); o1 = (o1 + 1))
  {
    for (int o2 = 0; (o2 <= 166); o2 = (o2 + 1))
    {
      int _ct168 = ((((32 * o1) + 31) < ((N / 10) - 5397))? ((32 * o1) + 31): ((N / 10) - 5397));
      #pragma unroll_and_jam (6)
      for (int o3 = (32 * o1); (o3 <= _ct168); o3 = (o3 + 1))
      {
        int _ct169 = ((5330 < ((32 * o2) + 31))? 5330: ((32 * o2) + 31));
        #pragma ivdep
        for (int o4 = (32 * o2); (o4 <= _ct169); o4 = (o4 + 1))
        {
          _arr_6_8[o3] = (_arr_6_8[o3] + (_arr_7_7[((5330 - o4) + o3)] * _arr_4_0[o4]));
        }
      }
    }
  }
    return 0;
}

I compiled this using the following command (file saved as test.cpp):

icpc -O3 -qopenmp -qopt-report=5 -qopt-report-file=stdout test.cpp > optrpt

However, I get a warning on stderr which says:

test.cpp(38): (col. 7) remark: unroll_and_jam pragma will be ignored due to

There is no reason specified for why the pragma is being ignored. Could you please help me diagnose this?
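For readers unfamiliar with the transformation the pragma requests, here is a minimal sketch of unroll-and-jam written by hand on a loop nest of the same shape as the o3/o4 nest above. It uses a factor of 2 rather than 6. The function names and the simplified indexing are hypothetical, and nothing here explains why the compiler declined the pragma.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference form: the outer loop walks the outputs, the inner loop is a
// reduction into out[i], as in out[o3] += a[(5330 - o4) + o3] * b[o4].
void correlate_plain(std::vector<double>& out, const std::vector<double>& a,
                     const std::vector<double>& b) {
    const std::size_t n = out.size(), m = b.size();
    assert(a.size() + 1 >= n + m);  // every a[(m - 1 - j) + i] is in range
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            out[i] += a[(m - 1 - j) + i] * b[j];
}

// Unroll-and-jam by 2: the outer loop is unrolled twice and the two copies
// of the inner loop are fused ("jammed") into one, so b[j] is loaded once
// and reused for two outputs.
void correlate_unroll_jam2(std::vector<double>& out, const std::vector<double>& a,
                           const std::vector<double>& b) {
    const std::size_t n = out.size(), m = b.size();
    assert(a.size() + 1 >= n + m);
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        double acc0 = out[i], acc1 = out[i + 1];
        for (std::size_t j = 0; j < m; ++j) {
            const double bj = b[j];
            acc0 += a[(m - 1 - j) + i] * bj;
            acc1 += a[(m - 1 - j) + i + 1] * bj;
        }
        out[i] = acc0;
        out[i + 1] = acc1;
    }
    // Remainder iteration when n is odd.
    for (; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            out[i] += a[(m - 1 - j) + i] * b[j];
}
```

Writing the jammed form by hand is one way to check whether the transformation pays off for a given nest, independently of what the pragma does.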

icpc -V

gives

Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.1.132 Build 20161005
Copyright (C) 1985-2016 Intel Corporation.  All rights reserved.

This bug is also present on

Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.174 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

Any suggestions on how to debug this would be appreciated.

Thanks,
Abhinav

↧

improper PDB files handling by ICC on Windows

April 23, 2017, 9:08 am

Hello, everyone,

I found that ICC handles PDB files somewhat differently than MSVC does, which can lead to errors during a build. For example, OpenSSL builds using ICC on Windows hit the following problems:
1. Improper PDB file naming:

[run 'Compiler 17.0 Update 2 for Intel 64 Visual Studio 2015 environment' command && change into OpenSSL sources directory]
c:\libOPENSSL-1.1.1-dev\build>set CC=icl
c:\libOPENSSL-1.1.1-dev\build>perl Configure threads no-deprecated shared no-asm VC-WIN64A && nmake
[snip]
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "crypto\include\internal\bn_conf.h.in" > crypto\include\internal\bn_conf.h
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "crypto\include\internal\dso_conf.h.in" > crypto\include\internal\dso_conf.h
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "include\openssl\opensslconf.h.in" > include\openssl\opensslconf.h
        icl  /I "." /I "crypto\include" /I "include" -DOPENSSL_USE_APPLINK -DDSO_WIN32 -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_API_COMPAT=0x10100000L "-DENGINESDIR=\"C:\\Program Files\\OpenSSL\\lib\\engines-1_1\"""-DOPENSSLDIR=\"C:\\Program Files\\Common Files\\SSL\"" -W3 -wd4090 -Gs0 -GF -Gy -nologo -DOPENSSL_SYS_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -D_CRT_SECURE_NO_DEPRECATE -DUNICODE -D_UNICODE /MD /O2 /Zi /Fdossl_static -c /Focrypto\aes\aes_cbc.obj "crypto\aes\aes_cbc.c"
aes_cbc.c
error #31000: corrupt PDB file ossl_static\vc140.pdb; delete and rebuild; if problem persists, delete and try /Z7 instead
error code 3 ( can't write file, out of disk, etc.) opening pdb ossl_static\vc140.pdb
compilation aborted for crypto\aes\aes_cbc.c (code 1)
NMAKE : fatal error U1077: '"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\bin\intel64\icl.EXE"' : return code '0x1'
Stop.

This relates to the compiler options
'/Fdossl_static', '/Fddso' and '/Fdapp'
in the file 'Configurations/10-main.conf' in the OpenSSL source folder.

If they are changed to
'/Fdossl_static.pdb', '/Fddso.pdb' and '/Fdapp.pdb'
before the build, the PDB files are created with the proper names
'ossl_static.pdb', 'dso.pdb' and 'app.pdb',
and all 'nmake' tasks finish successfully.

The error does not reproduce for builds using MSVC. That is, ICC, unlike MSVC, requires the '.pdb' extension to be specified explicitly in '/Fd' options; otherwise ICC sets the PDB file names to '...\vc140.pdb'.

2. Improper PDB file handling for parallel builds:

[extract 'jom' and 'nasm' executables to OpenSSL directory && run 'Compiler 17.0 Update 2 for Intel 64 Visual Studio 2015 environment' command && change into OpenSSL sources directory]
c:\libOPENSSL-1.1.1-dev\build>set CC=icl
c:\libOPENSSL-1.1.1-dev\build>perl Configure -QxAVX -fp:strict -Qprec -Zc:wchar_t -nologo -Qstd=c11 -O3 -MD threads no-deprecated shared VC-WIN64A && jom /J4
[snip]
        icl  /I "include" -DOPENSSL_USE_APPLINK -DDSO_WIN32 -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_API_COMPAT=0x10100000L "-DENGINESDIR=\"C:\\Program Files\\OpenSSL\\lib\\engines-1_1\"""-DOPENSSLDIR=\"C:\\Program Files\\Common Files\\SSL\"" -W3 -wd4090 -Gs0 -GF -Gy -nologo -DOPENSSL_SYS_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -D_CRT_SECURE_NO_DEPRECATE -DUNICODE -D_UNICODE /MD /O2  -QxAVX -fp:strict -Qprec -Zc:wchar_t -nologo -Qstd=c11 -O3 -MD /Zi /Fdapp.pdb -c /Fotest\buildtest_x509v3.obj "test\buildtest_x509v3.c"
icl: command line warning #10120: overriding '/O2' with '/O3'
buildtest_x509v3.c
        IF EXIST test\shlibloadtest.exe.manifest DEL /F /Q test\shlibloadtest.exe.manifest
error #31000: corrupt PDB file app.pdb; delete and rebuild; if problem persists, delete and try /Z7 instead
error code 3 ( can't write file, out of disk, etc.) opening pdb app.pdb
compilation aborted for test\buildtest_x509.c (code 1)
jom: C:\libOPENSSL-1.1.1-dev\build\Makefile [test\buildtest_x509.obj] Error 1
        link /nologo /debug /subsystem:console /opt:ref /out:test\shlibloadtest.exe @C:\Users\test\AppData\Local\Temp\shlibloadtest.exe.5752.221687.jom
        IF EXIST test\shlibloadtest.exe.manifest  mt -nologo -manifest test\shlibloadtest.exe.manifest -outputresource:test\shlibloadtest.exe
jom: C:\libOPENSSL-1.1.1-dev\build\Makefile [all] Error 2

The error reproduces for the Release+Shared configuration when the number of jobs is greater than 3 (other configurations require more; this needs to be verified). Running 'nmake' or 'jom /J1' from that point continues the build and finishes all tasks successfully.

ICC has no compiler option for handling PDB files during parallel builds (similar to '/FS' for MSVC), so I assumed that ICC deals with this by itself. In that case, it turns out that ICC has a limit on the number of parallel jobs that can write to a single PDB file, which MSVC does not have.

Since ICC on Windows imitates MSVC, is it possible to make its behavior regarding PDB files identical to MSVC for both cases above?

Environment:

Windows 10 x64,
IPSXE 2017 Update 2,
VS 2015 Update 3,
Windows SDK 10.0.14393.33,
Active Perl 5.24.1,
nasm 2.13rc20,
jom 1.1.2,
OpenSSL 1.1.1-dev (f919c12f5c8b92f0318c650573e774fe6522c27c).

Alexander

Zone:

Windows*

Thread Topic:

Bug Report

↧

How can I solve this Read-After-Write dependency?

April 25, 2017, 1:46 pm

I'm trying to vectorize the inner `for` of this function:

    void SIFTDescriptor::samplePatch(float *vec)
    {
       for (int r = 0; r < par.patchSize; ++r)
       {
          const int br0 = par.spatialBins * bin0[r]; const float wr0 = w0[r];
          const int br1 = par.spatialBins * bin1[r]; const float wr1 = w1[r];
          for (int c = 0; c < par.patchSize; ++c)
          {
             const float val = mask.at<float>(r,c) * grad.at<float>(r,c);

             const int bc0 = bin0[c];
             const float wc0 = w0[c]*val;
             const int bc1 = bin1[c];
             const float wc1 = w1[c]*val;

             // ori from atan2 is in range <-pi,pi> so add 2*pi to be surely above zero
             const float o = float(par.orientationBins)*(ori.at<float>(r,c) + 2*M_PI)/(2*M_PI);

             int   bo0 = (int)o;
             const float wo1 =  o - bo0;
             bo0 %= par.orientationBins;

             int   bo1 = (bo0+1) % par.orientationBins;
             const float wo0 = 1.0f - wo1;

             // add to corresponding 8 vec...
             if (wr0*wc0>0) {
                 vec[br0+bc0+bo0] += wr0*wc0 * wo0;
                 vec[br0+bc0+bo1] += wr0*wc0 * wo1;
             }
             if (wr0*wc1>0) {
                 vec[br0+bc1+bo0] += wr0*wc1 * wo0;
                 vec[br0+bc1+bo1] += wr0*wc1 * wo1;
             }
             if (wr1*wc0>0) {
                 vec[br1+bc0+bo0] += wr1*wc0 * wo0;
                 vec[br1+bc0+bo1] += wr1*wc0 * wo1;
             }
             if (wr1*wc1>0) {
                 vec[br1+bc1+bo0] += wr1*wc1 * wo0;
                 vec[br1+bc1+bo1] += wr1*wc1 * wo1;
             }
          }
       }
    }

However, Intel Advisor tells me that there are two Read-After-Write dependencies in:

     vec[br0+bc0+bo0] += wr0*wc0 * wo0;

And:

     vec[br1+bc0+bo0] += wr1*wc0 * wo0;

How can I solve them? This is the first time that I try to solve such a dependency and I'm struggling a little bit...
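A note on where the dependency comes from: `vec[index] += value` with a data-dependent `index` is a histogram update, and two different iterations of `c` may hit the same bin. One common way to handle this, sketched below under simplifying assumptions, is to split the loop: a first pass computes bin indices and weights into temporary arrays (no iteration touches another's data, so it can vectorize), and a second scalar pass does the accumulation. The function and parameter names are hypothetical, and only the two orientation bins are modelled, not the full eight updates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified histogram: each sample i contributes `weight[i]`
// split between two neighbouring bins selected by the fractional position o[i].
// Assumes o[i] >= 0, as in the original code after adding 2*pi.
void accumulate_split(const std::vector<float>& o, const std::vector<float>& weight,
                      int bins, std::vector<float>& hist) {
    const std::size_t n = o.size();
    assert(bins > 0 && weight.size() == n);
    assert(hist.size() == static_cast<std::size_t>(bins));

    std::vector<int> b0(n), b1(n);
    std::vector<float> w0(n), w1(n);

    // Pass 1: iteration i writes only slot i of the temporaries, so there is
    // no loop-carried dependency here.
    for (std::size_t i = 0; i < n; ++i) {
        const int lo = static_cast<int>(o[i]);
        const float frac = o[i] - static_cast<float>(lo);
        b0[i] = lo % bins;
        b1[i] = (b0[i] + 1) % bins;
        w0[i] = weight[i] * (1.0f - frac);
        w1[i] = weight[i] * frac;
    }

    // Pass 2: the scatter-add. Different iterations may update the same bin,
    // which is the read-after-write dependency, so this part stays scalar.
    for (std::size_t i = 0; i < n; ++i) {
        hist[b0[i]] += w0[i];
        hist[b1[i]] += w1[i];
    }
}
```

Whether the split is faster depends on how much of the per-element work is arithmetic versus accumulation, so it is worth measuring rather than assuming.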

↧

OneCore aka Universal Windows Driver compatibility

April 25, 2017, 9:43 pm

Hi.

I want to build a simple "int main(){ return 0; }" program that passes the API Validator OneCore compatibility test.

After reading https://msdn.microsoft.com/en-us/windows/hardware/drivers/develop/getting-started-with-universal-drivers and https://msdn.microsoft.com/en-us/library/windows/desktop/mt654040(v=vs.85).aspx I proceeded to build a simple default solution in VS2015 u3 on my Windows 10 build 14393 system with the program described above.

If my platform toolset is "WindowsApplicationForDrivers10.0" or even "Visual Studio 2015 (v140)", running the API Validator that comes with the Windows SDK as a post-build event succeeds:

"C:\Program Files (x86)\Windows Kits\10\bin\x64\apivalidator.exe" -driverpackagepath:"$(OutDir)\" -SupportedApiXmlFiles:"C:\Program Files (x86)\Windows Kits\10\build\universalDDIs\$(Platform)\UniversalDDIs.xml" -ModuleWhiteListXmlFiles:"C:\Program Files (x86)\Windows Kits\10\build\universalDDIs\$(Platform)\ModuleWhiteList.xml"

If I switch the Platform Toolset to Intel C++ Compiler 17.0 u2, I get warnings that my executable now has a dynamic dependency on ntdll.dll, which is not OneCore compliant. According to my investigations, this is because libirc.lib was linked with a dependency on ntdll.lib, and the warnings appear for printf, vsnprintf, scanf, and variations of those functions.

Is it possible to generate a libirc.lib such that it ignores these default dependencies? This would allow the person building the executable or dll to choose the OneCore umbrella library (OneCore.lib) or the traditional desktop library (ntdll.lib) as the means to resolve those dependencies instead of having that decision being made in the libirc.lib.

Thank you.

Zone:

Windows*

↧

OpenMP task depend pointers and structs

April 26, 2017, 8:29 am

I have got the following code:

int foo(pm_t* t_lb){
  int i;
  int BLOCK=4;
#pragma omp task default(none) shared(t_lb, BLOCK)  private(i) \
  depend (inout: t_lb->a)

  {
    for (i=1;i < 1000 + BLOCK; i++){
      
      t_lb->a[i] =1.0f; 
    }
  }
}
int main(){
  
  pm_t* pm = NULL;
  pm = (pm_t*)calloc(1,sizeof(pm_t));
  
  
  pm->a = (double*)malloc(1000*sizeof(double));

#pragma omp parallel
  {
    /* Obtain thread number */
    int tid = omp_get_thread_num();
    printf("Hello World from thread = %d\n", tid);
    
#pragma omp master
    {
      foo(pm);
    }
  }
  
  return 0;
  
}

The structs:

typedef struct st pm_t;

struct st {
  double* a;
  double* t_b;
};

If this is compiled with Intel 17.0.2.174 I get:

tasks.c(10): error: invalid entity for this variable list in omp clause
    depend (inout: t_lb->a)

However I can compile it if I change the depend list to

depend (inout: t_lb)

Is there a way to specify a dependency on the "a" pointer instead of on the whole structure "t_lb"?
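One workaround that is often suggested, sketched here as an assumption rather than a confirmed fix for this compiler version, is to copy the member pointer into a local variable and name an array section of that local in the depend clause. The helper names `fill_a` and `run` and the length parameter `n` are hypothetical. The code also behaves correctly when compiled without OpenMP, since the pragmas are then ignored.

```cpp
#include <cassert>
#include <cstdlib>

struct pm_t {
    double* a;
    double* t_b;
};

// Creates a task that depends on the storage a[0..n-1] rather than on the
// pointer to the enclosing struct.
void fill_a(pm_t* t_lb, int n) {
    assert(t_lb != nullptr && t_lb->a != nullptr && n >= 0);
    double* a = t_lb->a;  // plain pointer variable, usable as a list item
#pragma omp task firstprivate(a, n) depend(inout: a[0:n])
    {
        for (int i = 0; i < n; ++i) {
            a[i] = 1.0;
        }
    }
}

// Runs the task from a single thread of a parallel region and sums the result.
double run(int n) {
    pm_t pm;
    pm.a = static_cast<double*>(std::malloc(static_cast<std::size_t>(n) * sizeof(double)));
    pm.t_b = nullptr;
    assert(pm.a != nullptr);
#pragma omp parallel
    {
#pragma omp single
        fill_a(&pm, n);
    }  // tasks created in the region have finished once it ends
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += pm.a[i];
    std::free(pm.a);
    return sum;
}
```

Other tasks that access the same array would need to name the same storage in their own depend clauses for the ordering to take effect.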

Zone:

Server

↧

ICC hangs on scipy 0.19.0 compilation of scipy/special/cython_special.c

April 27, 2017, 2:00 pm

ICC hangs on scipy 0.19.0 compilation of scipy/special/cython_special.c for 32 bit linux. Works fine on 64 bit linux, and 32 bit with gcc. The process just hangs and won't proceed.

Compiling with numpy 0.12.0 and mkl/icc/ifort. Flags are

Thread Topic:

Bug Report

↧

icl and icl++ command not found

April 28, 2017, 1:21 pm

Hello there,

I'm trying to use the clang compiler on Mac OS* X from the Terminal. It turns out the commands icl and icl++ are not recognized. I set up the compiler variables with compilervars.sh. I have Xcode 8.3.2 and the Intel Compiler doesn't seem to be supported at this time. Can anybody help me figure this out?

↧

Inefficient memory access pattern and irregular stride accesses

May 1, 2017, 2:15 am

I don't know if this is the right section; I apologize if it isn't.

I'm trying to optimize this function:

bool interpolate(const Mat &im, float ofsx, float ofsy, float a11, float a12, float a21, float a22, Mat &res)
{
   bool ret = false;
   // input size (-1 for the safe bilinear interpolation)
   const int width = im.cols-1;
   const int height = im.rows-1;
   // output size
   const int halfWidth  = res.cols >> 1;
   const int halfHeight = res.rows >> 1;
   float *out = res.ptr<float>(0);
   const float *imptr  = im.ptr<float>(0);
   for (int j=-halfHeight; j<=halfHeight; ++j)
   {
      const float rx = ofsx + j * a12;
      const float ry = ofsy + j * a22;
      #pragma omp simd
      for(int i=-halfWidth; i<=halfWidth; ++i, out++)
      {
         float wx = rx + i * a11;
         float wy = ry + i * a21;
         const int x = (int) floor(wx);
         const int y = (int) floor(wy);
         if (x >= 0 && y >= 0 && x < width && y < height)
         {
            // compute weights
            wx -= x; wy -= y;
            int rowOffset = y*im.cols;
            int rowOffset1 = (y+1)*im.cols;
            // bilinear interpolation
            *out =
                (1.0f - wy) *
                ((1.0f - wx) *
                imptr[rowOffset+x] +
                wx *
                imptr[rowOffset+x+1]) +
                (       wy) *
                ((1.0f - wx) *
                imptr[rowOffset1+x] +
                wx *
                imptr[rowOffset1+x+1]);
         } else {
            *out = 0;
            ret =  true; // touching boundary of the input
         }
      }
   }
   return ret;
}

I'm using Intel Advisor to optimize it and even though the inner for has already been vectorized, Intel Advisor detected inefficient memory access patterns:

60% of unit/zero stride access
40% of irregular/random stride access

In particular there are 4 gather (irregular) accesses in the following three instructions:

From my understanding, a gather access happens when the accessed element has the form a[b], where b is unpredictable. This seems to be the case with imptr[rowOffset+x], where both rowOffset and x are unpredictable.

At the same time, I see this Vertical Invariant, which should happen (again, from my understanding) when elements are accessed with a constant offset. But actually I don't see where this constant offset is.

So I have 3 questions:

Did I understand the problem of gather accesses correctly?
What about the Vertical Invariant access? I'm less sure about this point.
Finally, how can I improve/solve the memory access here?
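To make the terminology concrete, here is a small sketch of three access patterns as they would appear in a vectorized loop: unit stride, invariant (the same address in every iteration), and gather (the address depends on data). The function names are hypothetical, and mapping Advisor's "Vertical Invariant" label to the second pattern is an assumption, not something the report above confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unit stride: iteration i reads a[i], so neighbouring vector lanes read
// neighbouring floats and one contiguous load serves the whole vector.
float sum_unit(const std::vector<float>& a) {
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i];
    return s;
}

// Invariant: every iteration reads the same address (*scale), so the value
// can be loaded once and broadcast to all lanes.
float sum_scaled(const std::vector<float>& a, const float* scale) {
    assert(scale != nullptr);
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * (*scale);
    return s;
}

// Gather: the index comes from data (idx[i]), so each lane may read from an
// unrelated address, like imptr[rowOffset + x] with x derived from floor(wx).
float sum_gather(const std::vector<float>& a, const std::vector<int>& idx) {
    float s = 0.0f;
    for (std::size_t i = 0; i < idx.size(); ++i) {
        assert(idx[i] >= 0 && static_cast<std::size_t>(idx[i]) < a.size());
        s += a[idx[i]];
    }
    return s;
}
```

In the interpolation loop, the four `imptr[...]` reads match the gather pattern because `x` and `y` are computed per pixel from the affine transform.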

Compiled with icpc 2017 update 3 with the following flags:

INTEL_OPT=-O3 -ipo -simd -xCORE-AVX2 -parallel -qopenmp -fargument-noalias -ansi-alias -no-prec-div -fp-model fast=2 -fma -align -finline-functions
INTEL_PROFILE=-g -qopt-report=5 -Bdynamic -shared-intel -debug inline-debug-info -qopenmp-link dynamic -parallel-source-info=2 -ldl

↧

multiple openmp reduction on stl::vector gets wrong results

May 2, 2017, 12:28 pm

Hi,

In the following test code, I use omp declare reduction on std::vector. Using icpc 17.0, omp parallel reduction works correctly on a single std::vector. However, when I attempt to reduce into two separate vectors, v1 and v2, it returns v1=sum(v1)+sum(v2) and v2=0.0 (where sum is across threads). By my reading of the standard, I would expect it to return v1=sum(v1) and v2=sum(v2). This is what GCC 6.3 does.

Can my omp directives be tweaked to work with icpc 17.0? Or is there another problem?

Thanks in advance,

Sean

(was advised to resubmit this post to C++ forum - original is here: https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-ar...)

// following on from http://stackoverflow.com/questions/43168661/openmp-and-reduction-on-stdvector
//
#include <algorithm>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <vector>
#include <omp.h>

#pragma omp declare reduction(+ : std::vector<double> : std::transform(omp_in.begin(),omp_in.end(),omp_out.begin(),omp_out.begin(),std::plus<double>())) initializer (omp_priv=omp_orig)


int check1(int size){
  std::vector<double> result(size,0.0);

  int npart=100;
  int n;
  {
    int i;
// Works in GCC 6.3, Intel 17.0:
#pragma omp parallel for private(i,n) reduction(+:result)
    for (n=0; n<npart; n++){
      for (i=0; i<size; i++){
	result[i]  += i;
      }
    }
  }
  int fail=0;
  for (int i=0; i<size; i++) {
    if (result[i] != i*100){
      fail=1;
    }
    std::cout <<"i="<< i<<" "<<result[i]<<std::endl;
  }
  return fail;
}

int check2(int size){
  std::vector<double> result(size,0.0);
  std::vector<double> result2(size,0.0);

  int npart=100;
  int n;
  {
    int i;
// Works in GCC 6.3, fails in Intel 17.0:
//#pragma omp parallel for private(i,n) reduction(+:result),reduction(+:result2)
// Works in GCC 6.3, fails in Intel 17.0:
#pragma omp parallel for private(i,n) reduction(+:result,result2)
    for (n=0; n<npart; n++){
      for (i=0; i<size; i++){
	result[i]  += i;
	result2[i] += i;
      }
    }
  }
  int fail=0;
  for (int i=0; i<size; i++) {
    if (result[i] != i*100 || result2[i] != i*100){
      fail=1;
    }
    std::cout <<"i="<< i<<" "<<result[i]<<" "<<result2[i]<<std::endl;
  }
  return fail;
}



int main(int argc, char *argv[]) {

  int size;

  if (argc < 2)
    size = 10;
  else
    size = atoi(argv[1]);

  int fail1=check1(size);
  int fail2=check2(size);

  return fail1+fail2;
}
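Until the directive-based version behaves as expected, one possible workaround is to do the reduction by hand: each thread accumulates into private vectors and then merges them into the shared ones inside a critical section. This is a minimal sketch with a hypothetical function name, not a statement about what icpc 17.0 does with the declare reduction above. It also runs serially when OpenMP is disabled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds, for every n in [0, npart), the value i to v1[i] and v2[i].
void reduce_two(int size, int npart, std::vector<double>& v1, std::vector<double>& v2) {
    assert(size >= 0);
    assert(v1.size() == static_cast<std::size_t>(size));
    assert(v2.size() == static_cast<std::size_t>(size));
#pragma omp parallel
    {
        // Per-thread partial results, zero-initialized.
        std::vector<double> local1(size, 0.0), local2(size, 0.0);

#pragma omp for nowait
        for (int n = 0; n < npart; ++n) {
            for (int i = 0; i < size; ++i) {
                local1[i] += i;
                local2[i] += i;
            }
        }

        // Merge one thread at a time so the shared vectors are not raced on.
#pragma omp critical
        {
            for (int i = 0; i < size; ++i) {
                v1[i] += local1[i];
                v2[i] += local2[i];
            }
        }
    }
}
```

The critical section serializes the merge, which is acceptable when `size` is small compared with the work in the main loop.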

Thread Topic:

Bug Report

↧

AVX-512 X code and icpc error #10236: File not found: ' '

May 3, 2017, 6:32 pm

Hi,

I am attempting to compile the following test code with ICC 17.0.3 20170404 contained in file name testCXXCompiler.cxx:

int main(){return 0;}

The command line I am using is:

/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/icpc -xCORE-AVX512 -o testCXXCompiler.cxx.o -c testCXXCompiler.cxx

My machine is below:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                68
On-line CPU(s) list:   0-67
Thread(s) per core:    1
Core(s) per socket:    68
Socket(s):             1
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 87
Model name:            Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
Stepping:              1
CPU MHz:               998.101
BogoMIPS:              2793.41
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
NUMA node0 CPU(s):     0-67
NUMA node1 CPU(s):

The error is:

icpc: error #10236: File not found: ''

Using the compiler flag -v, I get this:

icpc: error #10236: File not found:  ' '
icpc version 17.0.3 (gcc version 4.8.5 compatibility)
/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/mcpcom    --target_efi2 --lang=c++ -_g -mP3OPT_inline_alloca -D__ICC=1700 -D__INTEL_COMPILER=1700 -D__INTEL_COMPILER_UPDATE=3 -D__PTRDIFF_TYPE__=long "-D__SIZE_TYPE__=unsigned long" -D__WCHAR_TYPE__=int "-D__WINT_TYPE__=unsigned int""-D__INTMAX_TYPE__=long int""-D__UINTMAX_TYPE__=long unsigned int" -D__LONG_MAX__=9223372036854775807L -D__QMSPP_ -D__OPTIMIZE__ -D__NO_MATH_INLINES -D__NO_STRING_INLINES -D__GNUC_GNU_INLINE__ -D__GNUG__=4 -D__GNUC__=4 -D__GNUC_MINOR__=8 -D__GNUC_PATCHLEVEL__=5 -D__LP64__ -D_LP64 -D_GNU_SOURCE=1 -D__DEPRECATED=1 -D__GXX_WEAK__=1 -D__GXX_ABI_VERSION=1002 "-D__USER_LABEL_PREFIX__= " -D__REGISTER_PREFIX__= -D__INTEL_RTTI__ -D__EXCEPTIONS=1 -D__unix__ -D__unix -D__linux__ -D__linux -D__gnu_linux__ -B -Dunix -Dlinux "-_Asystem(unix)" -D__ELF__ -D__x86_64 -D__x86_64__ -D__amd64 -D__amd64__ "-_Acpu(x86_64)""-_Amachine(x86_64)" -D__INTEL_COMPILER_BUILD_DATE=20170404 -D__INTEL_OFFLOAD -D__pentium4 -D__pentium4__ -D__tune_pentium4__ -D__SSE2__ -D__SSE2_MATH__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -D__SSE__ -D__SSE_MATH__ -D__MMX__ -D__AVX__ -D__AVX_I__ -D__AVX2__ -D__FMA__ -D__BETA_BDW__ -D__AVX512F__ -D__AVX512CD__ -D__AVX512DQ__ -D__AVX512BW__ -D__AVX512VL__ -_k -_8 -_l --has_new_stdarg_support -_a -_b --gnu_version=40805 -_W5 --gcc-extern-inline -p --bool -tused -x --multibyte_chars -mGLOB_diag_suppress_sys --array_section --simd --simd_func --offload_mode=1 --offload_target_names=gfx,GFX,mic,MIC --offload_unique_string=icpc1289335044CQUak5 --bool -mGLOB_em64t=TRUE -mP1OPT_version=17.0-intel64 -mGLOB_diag_file=testCXXCompiler.cxx.diag -mGLOB_long_size_64 -mGLOB_routine_pointer_size_64 -mP1OPT_print_version=FALSE -mCG_use_gas_got_workaround=F -mP2OPT_align_option_used=TRUE -mGLOB_gcc_version=485 "-mGLOB_options_string=-xCORE-AVX512 -v -o testCXXCompiler.cxx.o -c" -mGLOB_cxx_limited_range=FALSE -mCG_extend_parms=FALSE 
-mGLOB_compiler_bin_directory=/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64 -mGLOB_as_output_backup_file_name=/tmp/icpcSuI5w8as_.s -mIPOPT_activate -mIPOPT_lite -mGLOB_instruction_tuning=0x0 -mGLOB_uarch_tuning=0x0 -mGLOB_product_id_code=0x22006d8e -mCG_bnl_movbe=T -mGLOB_extended_instructions=0x8000 -mGLOB_advanced_optim=TRUE -mP3OPT_use_mspp_call_convention -mP2OPT_subs_out_of_bound=FALSE -mP2OPT_disam_type_based_disam=2 -mP2OPT_disam_assume_ansi_c -mP2OPT_checked_disam_ansi_alias=TRUE -mGLOB_ansi_alias -mPGOPTI_value_profile_use=T -mP2OPT_il0_array_sections=TRUE -mGLOB_offload_mode=1 -mP2OPT_offload_unique_var_string=icpc1289335044CQUak5 -mP2OPT_hlo_level=2 -mP2OPT_hlo -mP2OPT_hpo_rtt_control=0 -mIPOPT_args_in_regs=0 -mP2OPT_disam_assume_nonstd_intent_in=FALSE -mGLOB_imf_mapping_library=/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/libiml_attr.so -mPGOPTI_gen_threadsafe_level=0 -mIPOPT_lto_object_enabled -mIPOPT_lto_object_value=1 -mIPOPT_obj_output_file_name=testCXXCompiler.cxx.o -mIPOPT_whole_archive_fixup_file_name=/tmp/icpcwarchCdWFgm -mGLOB_linker_version=2.25.1 -mGLOB_driver_tempfile_name=/tmp/icpctempfileaSVN8X -mP3OPT_asm_target=P3OPT_ASM_TARGET_GAS -mGLOB_async_unwind_tables=TRUE -mGLOB_obj_output_file=testCXXCompiler.cxx.o -mGLOB_source_dialect=GLOB_SOURCE_DIALECT_C_PLUS_PLUS -mP1OPT_source_file_name=testCXXCompiler.cxx -mP1OPT_full_source_file_name=/home/jlefman/src/ITK/build-avx512-001/testCXXCompiler.cxx -mGLOB_eh_linux testCXXCompiler.cxx
#include "..." search starts here:
#include <...> search starts here:
 /opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include/intel64
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include/icc
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include
 /usr/include/c++/4.8.5
 /usr/include/c++/4.8.5/x86_64-redhat-linux
 /usr/include/c++/4.8.5/backward
 /usr/local/include
 /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
 /usr/include/
 /usr/include
End of search list.

System info uname -a

Linux nodename.domain.com 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

What is the cause of this issue? How can it be resolved?

Thank you.

Thread Topic:

Bug Report

↧

Intel C++ compiler for Intel Itanium architecture

May 5, 2017, 9:09 am

*** Intel C++ compiler for Intel Itanium architecture ***

↧

internal error: assertion failed at: "shared/cfe/edgcpfe/folding.c"

May 5, 2017, 9:13 am

Hello,

From v16 Update 3 on, compiling the following code, which looks valid, produces an internal error:

>icl main.cpp
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.187 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

main.cpp
main.cpp(8): internal error: assertion failed at: "shared/cfe/edgcpfe/folding.c", line 8623

        if (__has_assign(R))
                         ^

compilation aborted for main.cpp (code 4)

The code in question is:

#include <stdio.h>

struct R {
	void operator=(R& r) {}
};

int main() {
	if (__has_assign(R))
		printf("success\n");
	else
		printf("fail\n");

	return 0;
}

The problem is that, in my project, similar code comes from the headers of a product SDK that we integrate with and do not control.
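Where the code can be changed, the standard type traits from `<type_traits>` give a similar kind of query without the compiler intrinsic. The sketch below is only an illustration: the traits are not claimed to be an exact replacement for `__has_assign`, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct R {
    void operator=(R&) {}
};

// True when the expression `r1 = r2` compiles for two non-const lvalues of R.
constexpr bool assignable_from_lvalue() {
    return std::is_assignable<R&, R&>::value;
}

// True when `r1 = r2` compiles for a const source. It is false for R because
// operator=(R&) cannot bind a const argument, and declaring it suppresses the
// implicitly generated copy assignment operator.
constexpr bool copy_assignable() {
    return std::is_copy_assignable<R>::value;
}

// Small run-time usage of the two queries.
void check_traits() {
    assert(assignable_from_lvalue());
    assert(!copy_assignable());
}
```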

Best regards,

Alexander

Zone:

Windows*

Thread Topic:

Bug Report

↧

improper UTF-8 characters handling by ICC on Windows

May 6, 2017, 2:27 am

Hello, everyone,

When building ICU4C using ICC on Windows, I got the following error:

sh-4.4$ (INSTALLDIR="$PWD/../../ICC64RH"&& (CC="icl" CFLAGS="-MD" CXX="icl" CXXFLAGS="-MD" LD="xilink" ./configure --prefix="$INSTALLDIR" --disable-debug --enable-release --enable-shared --disable-static >_configure.log && make >_make.log) 2>_stderr.log)
[snip]
sh-4.4$ make tests
[snip]
icl   -DHAVE_DLOPEN=0 -DU_HAVE_ATOMIC=1 -DU_HAVE_MMAP=0 -DU_HAVE_DIRENT_H=0 -DU_HAVE_POPEN=0 -DU_HAVE_STRTOD_L=0  -DU_RELEASE=1 -D_CRT_SECURE_NO_DEPRECATE -I. -I../../common -I../../i18n -I../../tools/toolutil -I../../tools/ctestfw -DUNISTR_FROM_CHAR_EXPLICIT= -DUNISTR_FROM_STRING_EXPLICIT= -DUCHAR_TYPE=char16_t -DU_ATTRIBUTE_DEPRECATED= -DWIN32 -DCYGWINMSVC -D'U_TOPSRCDIR="../../"' -D'U_TOPBUILDDIR="/c/libICU4C-59.1/build/source/"' -MD   -GF -nologo -EHsc -Zc:wchar_t -c   -Forbbitst.o rbbitst.cpp
rbbitst.cpp
rbbitst.cpp(1282): error: too many characters in character constant
              if (c == u'???') {
                       ^

compilation aborted for rbbitst.cpp (code 2)
make[2]: *** [../../config/mh-msys-msvc:142: rbbitst.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory '/c/libICU4C-59.1/build/source/test/intltest'
make[1]: *** [Makefile:68: all-recursive] Error 2
make[1]: Leaving directory '/c/libICU4C-59.1/build/source/test'
make: *** [Makefile:213: tests] Error 2

which relates to the code:

if (c == u'•') {

in the specified file and to the compiler option '-utf-8', which appeared in the ICU4C 59.1 MSVC toolchain.

ICC doesn't support this option and emits a warning:

icl: command line warning #10006: ignoring unknown option '/utf-8'

A similar error can be reproduced using MSVC if ICU is built without the '-utf-8' option:

sh-4.4$ (INSTALLDIR="$PWD/../../MSVC64RH"&& (CC="cl" CFLAGS="-MD" CXX="cl" CXXFLAGS="-MD" LD="link" ./configure --prefix="$INSTALLDIR" --disable-debug --enable-release --enable-shared --disable-static >_configure.log && make >_make.log) 2>_stderr.log)
[snip]
sh-4.4$ make tests
[snip]
make[2]: Entering directory '/c/libICU4C-59.1/build/source/test/intltest'
cl   -DHAVE_DLOPEN=0 -DU_HAVE_ATOMIC=1 -DU_HAVE_MMAP=0 -DU_HAVE_DIRENT_H=0 -DU_HAVE_POPEN=0 -DU_HAVE_TZNAME=0 -DU_HAVE_STRTOD_L=0  -DU_RELEASE=1 -D_CRT_SECURE_NO_DEPRECATE -I. -I../../common -I../../i18n -I../../tools/toolutil -I../../tools/ctestfw -DUNISTR_FROM_CHAR_EXPLICIT= -DUNISTR_FROM_STRING_EXPLICIT= -DUCHAR_TYPE=char16_t -DU_ATTRIBUTE_DEPRECATED= -DWIN32 -DCYGWINMSVC -D'U_TOPSRCDIR="../../"' -D'U_TOPBUILDDIR="/c/libICU4C-59.1/build/source/"' -MD   -GF -nologo -EHsc -Zc:wchar_t -c   -Forbbitst.o rbbitst.cpp
rbbitst.cpp
rbbitst.cpp(1282): error C2015: too many characters in constant
make[2]: *** [../../config/mh-msys-msvc:142: rbbitst.o] Error 2
make[2]: Leaving directory '/c/libICU4C-59.1/build/source/test/intltest'
make[1]: *** [Makefile:68: all-recursive] Error 2
make[1]: Leaving directory '/c/libICU4C-59.1/build/source/test'
make: *** [Makefile:213: tests] Error 2

As for mingw-w64, it doesn't reproduce this error even without '-fexec-charset= and -finput-charset=' options.

Using undocumented ICC option '-Qoption,cpp,"--uliterals"' (see threads ICC v13 Beta Questions, C++11, Intel Parallel Studio 13 XE & Unicode String Literals, etc.) doesn't solve the issue.

Since ICC on Windows imitates MSVC, would it be possible to add support for the '-utf-8' option? That would make ICC's handling of UTF-8 characters on Windows identical to MSVC's.

Environment:

Windows 10 x64,
IPSXE 2017 Update 2,
VS 2015 Update 3,
Windows SDK 10.0.14393.33,
MSYS2 20161025,
ICU4C 59.1.

Alexander
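
A possible source-level workaround for the `u'•'` comparison above, offered only as a sketch and not verified against ICC or the ICU sources: spell the character as a universal character name, so that the literal no longer depends on how the compiler decodes the bytes of the source file. The helper name `is_bullet` is hypothetical; U+2022 is the code point of '•'.

```cpp
// Hypothetical helper (not part of ICU): compares against U+2022 (BULLET)
// written as a universal character name rather than as raw UTF-8 bytes.
constexpr bool is_bullet(char16_t c) {
    return c == u'\u2022';
}

// Compile-time checks: the UCN spelling denotes exactly one UTF-16 code unit.
static_assert(is_bullet(u'\u2022'), "UCN spelling yields U+2022");
static_assert(static_cast<unsigned>(u'\u2022') == 0x2022u, "code point check");
static_assert(!is_bullet(u'a'), "other characters do not match");
```

Under this assumption the condition would read `if (is_bullet(c))` or simply `if (c == u'\u2022')`. It only sidesteps the source-encoding question for this one literal; it does not replace '-utf-8' support.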

Zone:

Windows*

↧

NUMA system detection

May 7, 2017, 10:16 am

Hello,

I was wondering what the difference is between a system being NUMA and a system having NUMA enabled. Moreover, how can I tell from within the compiler whether the system is NUMA? Does being a NUMA system depend solely on the processor inside the system? In other words, if I have processor X, can I conclude from that alone that any system with such a processor is a NUMA system?

Thank you,

Iulia
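
As far as I know, the NUMA topology is something the operating system reports at run time rather than something a compiler can know for the machine the binary will eventually run on, so a check like this is usually done in the program itself. Below is a minimal sketch under that assumption. The function names `numa_node_count` and `is_numa_system` are hypothetical. On Windows it uses `GetNumaHighestNodeNumber`; elsewhere it assumes the Linux sysfs layout with one `nodeN` directory per node.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <cctype>
#include <dirent.h>
#endif

// Returns the number of NUMA nodes the OS reports, or 1 when the
// information is unavailable (treated here as "not NUMA").
int numa_node_count() {
#ifdef _WIN32
    ULONG highest = 0;
    if (!GetNumaHighestNodeNumber(&highest)) return 1;
    // Highest node number plus one; node numbers need not be contiguous,
    // so this is an upper bound on the node count.
    return static_cast<int>(highest) + 1;
#else
    // Assumption: Linux sysfs exposes one "nodeN" directory per NUMA node.
    DIR* dir = opendir("/sys/devices/system/node");
    if (!dir) return 1;
    int count = 0;
    while (dirent* entry = readdir(dir)) {
        const std::string name = entry->d_name;
        if (name.size() > 4 && name.compare(0, 4, "node") == 0 &&
            std::isdigit(static_cast<unsigned char>(name[4]))) {
            ++count;
        }
    }
    closedir(dir);
    return count > 0 ? count : 1;
#endif
}

// More than one node means the OS currently exposes a NUMA topology.
bool is_numa_system() {
    return numa_node_count() > 1;
}
```

A result of 1 does not prove the hardware is incapable of NUMA: a multi-socket machine can be configured in firmware (for example with memory interleaving across nodes) so that the OS sees a single node, which is one plausible reading of "NUMA system" versus "NUMA enabled".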

Zone:

Windows*

Thread Topic:

How-To

↧

Using custom Clang version

May 9, 2017, 6:11 am

Hello,

I am trying to upgrade a project to a newer compiler stack and language standard. I end up with some strange errors. I have this command line:

/opt/intel/compilers_and_libraries_2016.4.210/mac/bin/intel64/icpc -D_REENTRANT -DHAVE_GCCVISIBILITY -DMACOSX \
-mmacosx-version-min=10.11 -clangxx-name=/usr/local/llvm-3.9.1/bin/clang++ -std=c++11 -fno-omit-frame-pointer \
-fPIC -fvisibility=hidden -fmath-errno -qopenmp -fp-model precise -g -Wall -Wno-deprecated -pthread -o foo.o -c foo.cpp

It returns:

In file included from foo.cpp(9):
bar.hpp(16): catastrophic error: cannot open source file "iostream"
    #include <iostream>
                       ^
compilation aborted for foo.cpp (code 4)

It builds fine if built for macOS 10.8 and gnu++98.

Another problem that I am facing is that I can't get LLVM functionality, for example llvm::dyn_cast is unknown no matter what I include.

Has anybody faced similar problems?

Thread Topic:

Question

↧

Assembly code generated by ICL for Windows

May 9, 2017, 6:53 pm

Hi,

I am trying to use ICL to generate assembly, post-process that assembly, and finally build the binary from the updated assembly. However, this workflow does not work because ML64.exe does not accept the assembly that ICL generates.

The following is the output from compiling a simple Hello World-style program:

main() {  printf ("test\n"); }

Commands and outputs below:

test.c

test.asm
Assembling: test.asm
test.asm(18) : error A2008:syntax error : .1
test.asm(27) : error A2008:syntax error : .5
test.asm(35) : error A2008:syntax error : .2
test.asm(42) : error A2008:syntax error : .3
test.asm(50) : error A2008:syntax error : .main
test.asm(57) : error A2008:syntax error : .main
test.asm(58) : error A2006:undefined symbol : .B1
test.asm(59) : error A2006:undefined symbol : .unwind

Would this be fixable?

Thanks.

david

Thread Topic:

Question

↧

FMA not used

May 10, 2017, 3:12 am

Hi,

I was surprised to spot the following behavior of the Intel compiler (17.0.2 20170213 on Linux) when using -xCORE-AVX2. The following code generates FMA instructions

double norm(double* x, int n) {
  double ans = 0.0;
  for (int i = 0; i < n; ++i) {
    ans += x[i] * x[i];
  }
  return ans;
}

but the following code does not

float norm(float* x, int n) {
  float ans = 0.0f;
  for (int i = 0; i < n; ++i) {
    ans += x[i] * x[i];
  }
  return ans;
}

Is there a reason for this, or is it a missed optimization by the compiler?

Best regards,

Francois
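
One way to probe this, offered as a sketch rather than an explanation of the compiler's behavior, is to request the fused operation explicitly with `std::fma` from `<cmath>` and compare the generated assembly with the plain loop. The function name `norm_fma` is hypothetical.

```cpp
#include <cmath>

// Sum of squares with an explicit fused multiply-add per element.
// Hypothetical variant of the float norm() above.
float norm_fma(const float* x, int n) {
    float ans = 0.0f;
    for (int i = 0; i < n; ++i) {
        // Computes x[i] * x[i] + ans with a single rounding step.
        ans = std::fma(x[i], x[i], ans);
    }
    return ans;
}
```

Whether this maps to a hardware FMA instruction still depends on the target options (such as -xCORE-AVX2); without hardware support, `std::fma` may fall back to a slower library routine. Because the fused form rounds once per element, results can differ in the last bits from the separate multiply and add.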

↧

Intel C++ compiler XE 12.1 for Visual Studio 2015 Professional

May 10, 2017, 9:11 am

Hello,

I have some old code that needs Intel C++ compiler XE 12.1 to compile. I have Visual Studio 2015 Professional on my laptop, which runs Windows 7.

I installed the compiler XE 12.1, but during installation it said that some components could not be installed because they require a Microsoft development product (see attached picture). I installed it anyway, but VS 2015 cannot detect the compiler from tools.

For VS, I already have the "Common Tools for Visual C++ 2015" installed (which I think is needed). What else do I need?

Is it because VS 2015 does not support compiler XE 12.1? If so, is there any workaround so that I can compile my old code in VS 2015?

Thanks a lot in advance!

Thread Topic:

Help Me

↧

Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588

May 10, 2017, 9:19 am

I am running into the following ICE with the relatively minimal test case below.

Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588

template <char>
struct Index {
  constexpr Index() : value_(0) {
  }

 private:
  int value_;
};

template <int>
int foo() {
  static constexpr Index<'i'> i;
  return 0;
}

int bar() {
  return foo<3>();
}

See versions below

[...]:~/test> gcc --version
gcc (GCC) 6.3.0 20161221 (Cray Inc.)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[...]:~/test> icpc --version
icpc (ICC) 17.0.2 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

[...]:~/test> gcc -std=c++14 -c -o test.o test.cpp
[...]:~/test> icpc -std=c++14 -c -o test.o test.cpp
Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588


compilation aborted for test.cpp (code 4)

This can be reproduced across many platforms and compiler versions, though this particular instance is in the Cray Linux Environment with PrgEnv-intel/6.0.3 and gcc/6.3.0.

Thread Topic:

Bug Report

↧

CL in ICL VS2017 window doesn't find stdint.h

May 11, 2017, 10:44 am

C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.0.065\windows\compiler\include\stdint.h(39): fatal error C1083: Cannot open include file: '../../vc/include/stdint.h': No such file or directory

This occurs with both XE2017u4 and XE2018 beta.

If the CL build step is run in the VS2017 developer prompt, the resulting .obj has no problem linking in the Intel cmd window for either VS2015 or 2017.

The CL build failure occurs only when VS2017 is set under the ICL cmd window. ICL doesn't have a problem.

Warnings about redefinition of HUGE_VAL etc. occur in both the working and non-working cases. They show that math.h is being processed successfully, although with warnings, along the same include path. Those redefinitions are present in the ICL math.h.

There may be something wrong with my setup, but I don't see it, other than the strange message about looking for a POSIX-style include path even though I am running under a DOS-style cmd window. That is actually the way the paths are written in the ICL installation of stdint.h.

Zone:

Windows*

Thread Topic:

Bug Report

↧