Channel: Intel® Software - Intel® C++ Compiler

unroll_and_jam pragma ignored but no reason specified


Hi,

Consider the following C++ code:

#include <malloc.h>
#include <cmath>
#include <complex>

int main(int argc, char **argv) {
    int N = 4000000;
    double * _arr_4_0;
  _arr_4_0 = (double *) (malloc((sizeof(double) * (unsigned long) (5331.0))));
  for (int _i0 = 0; (_i0 <= 5330); _i0 = (_i0 + 1))
  {
    _arr_4_0[_i0] = std::sin(_i0);
  }
    double * _arr_7_7;
  _arr_7_7 = (double *) (malloc((sizeof(double) * (unsigned long) (((0.1 * (double) (N)) + -66.0)))));
  #pragma omp parallel for schedule(static)
  #pragma ivdep
  for (int _i0 = 0; (_i0 < ((N / 10) - 66)); _i0 = (_i0 + 1))
  {
    _arr_7_7[_i0] = std::sqrt(_i0);
  }
    std::complex<double> * _arr_6_8;
  _arr_6_8 = (std::complex<double> *) (malloc((sizeof(std::complex<double>) * (unsigned long) (((0.1 * (double) (N)) + -5396.0)))));
  for (int o1 = 0; (o1 < (((N + 110) / 320) - 168)); o1 = (o1 + 1))
  {
    int _ct167 = ((((32 * o1) + 31) < ((N / 10) - 5397))? ((32 * o1) + 31): ((N / 10) - 5397));
    for (int o2 = (32 * o1); (o2 <= _ct167); o2 = (o2 + 1))
    {
      _arr_6_8[o2] = (0.0 + 0.0j);
    }
  }
  #pragma omp parallel for schedule(static)
  for (int o1 = 0; (o1 < (((N + 110) / 320) - 168)); o1 = (o1 + 1))
  {
    for (int o2 = 0; (o2 <= 166); o2 = (o2 + 1))
    {
      int _ct168 = ((((32 * o1) + 31) < ((N / 10) - 5397))? ((32 * o1) + 31): ((N / 10) - 5397));
      #pragma unroll_and_jam (6)
      for (int o3 = (32 * o1); (o3 <= _ct168); o3 = (o3 + 1))
      {
        int _ct169 = ((5330 < ((32 * o2) + 31))? 5330: ((32 * o2) + 31));
        #pragma ivdep
        for (int o4 = (32 * o2); (o4 <= _ct169); o4 = (o4 + 1))
        {
          _arr_6_8[o3] = (_arr_6_8[o3] + (_arr_7_7[((5330 - o4) + o3)] * _arr_4_0[o4]));
        }
      }
    }
  }
    return 0;
}

I compiled this using the following command (file saved as test.cpp):

icpc -O3 -qopenmp -qopt-report=5 -qopt-report-file=stdout test.cpp > optrpt

However, I get a warning on stderr which says:

test.cpp(38): (col. 7) remark: unroll_and_jam pragma will be ignored due to 

There is no reason specified for why the pragma is being ignored. Could you please help me diagnose this?

icpc -V

gives

Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.1.132 Build 20161005
Copyright (C) 1985-2016 Intel Corporation.  All rights reserved.

This bug is also present on

Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.174 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

Any suggestions on how to debug this would be appreciated.

Thanks,
Abhinav
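
For readers unfamiliar with the transformation that the unroll_and_jam pragma requests, here is a minimal hand-written sketch. It assumes a simplified two-level loop nest, not the exact nest from the post. The function names accumulate_plain and accumulate_unroll_jam2 are hypothetical, and the unroll factor of 2 is chosen only for brevity (the post asks for 6). The outer loop is unrolled and the resulting copies of the inner loop are fused ("jammed"), so each loaded w[j] is reused for two outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: out[i] += in[i + j] * w[j] for every (i, j) pair.
void accumulate_plain(std::vector<double>& out,
                      const std::vector<double>& in,
                      const std::vector<double>& w) {
    assert(out.empty() || w.empty() || in.size() >= out.size() + w.size() - 1);
    for (std::size_t i = 0; i < out.size(); ++i)
        for (std::size_t j = 0; j < w.size(); ++j)
            out[i] += in[i + j] * w[j];
}

// Hand-written unroll-and-jam by a factor of 2 (illustrative only).
void accumulate_unroll_jam2(std::vector<double>& out,
                            const std::vector<double>& in,
                            const std::vector<double>& w) {
    assert(out.empty() || w.empty() || in.size() >= out.size() + w.size() - 1);
    const std::size_t n = out.size();
    std::size_t i = 0;
    // Two consecutive outer iterations share one jammed inner loop.
    for (; i + 1 < n; i += 2) {
        double acc0 = out[i];
        double acc1 = out[i + 1];
        for (std::size_t j = 0; j < w.size(); ++j) {
            const double wj = w[j];          // loaded once, used twice
            acc0 += in[i + j] * wj;
            acc1 += in[i + 1 + j] * wj;
        }
        out[i] = acc0;
        out[i + 1] = acc1;
    }
    // Remainder iteration when the outer trip count is odd.
    for (; i < n; ++i)
        for (std::size_t j = 0; j < w.size(); ++j)
            out[i] += in[i + j] * w[j];
}
```

The transformation is only legal when the jammed outer iterations do not depend on each other. The sketch does not explain why the compiler ignored the pragma in the reported case.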


improper PDB files handling by ICC on Windows


Hello, everyone,

I found that ICC handles PDB files somewhat differently than MSVC does, which can lead to build errors. For example, OpenSSL builds using ICC on Windows run into two problems:
1. Improper PDB file naming:

[run 'Compiler 17.0 Update 2 for Intel 64 Visual Studio 2015 environment' command && change into OpenSSL sources directory]
c:\libOPENSSL-1.1.1-dev\build>set CC=icl
c:\libOPENSSL-1.1.1-dev\build>perl Configure threads no-deprecated shared no-asm VC-WIN64A && nmake
[snip]
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "crypto\include\internal\bn_conf.h.in" > crypto\include\internal\bn_conf.h
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "crypto\include\internal\dso_conf.h.in" > crypto\include\internal\dso_conf.h
        "C:\ProgramData\Perl64\bin\perl.exe" "-I." -Mconfigdata "util\dofile.pl" "-omakefile" "include\openssl\opensslconf.h.in" > include\openssl\opensslconf.h
        icl  /I "." /I "crypto\include" /I "include" -DOPENSSL_USE_APPLINK -DDSO_WIN32 -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_API_COMPAT=0x10100000L "-DENGINESDIR=\"C:\\Program Files\\OpenSSL\\lib\\engines-1_1\"""-DOPENSSLDIR=\"C:\\Program Files\\Common Files\\SSL\"" -W3 -wd4090 -Gs0 -GF -Gy -nologo -DOPENSSL_SYS_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -D_CRT_SECURE_NO_DEPRECATE -DUNICODE -D_UNICODE /MD /O2 /Zi /Fdossl_static -c /Focrypto\aes\aes_cbc.obj "crypto\aes\aes_cbc.c"
aes_cbc.c
error #31000: corrupt PDB file ossl_static\vc140.pdb; delete and rebuild; if problem persists, delete and try /Z7 instead
error code 3 ( can't write file, out of disk, etc.) opening pdb ossl_static\vc140.pdb
compilation aborted for crypto\aes\aes_cbc.c (code 1)
NMAKE : fatal error U1077: '"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\bin\intel64\icl.EXE"' : return code '0x1'
Stop.

This relates to the compiler options
'/Fdossl_static', '/Fddso' and '/Fdapp'
in the file 'Configurations/10-main.conf' in the OpenSSL source tree.

If they are changed to
'/Fdossl_static.pdb', '/Fddso.pdb' and '/Fdapp.pdb'
before the build, the PDB files are created with the proper names
'ossl_static.pdb', 'dso.pdb' and 'app.pdb',
and all 'nmake' tasks finish successfully.

The error does not reproduce for builds using MSVC. That is, ICC, unlike MSVC, requires the '.pdb' extension to be specified explicitly in '/Fd' options. Otherwise ICC sets the PDB file name to '...\vc140.pdb'.

2. Improper PDB file handling for parallel builds:
 

[extract 'jom' and 'nasm' executables to OpenSSL directory && run 'Compiler 17.0 Update 2 for Intel 64 Visual Studio 2015 environment' command && change into OpenSSL sources directory]
c:\libOPENSSL-1.1.1-dev\build>set CC=icl
c:\libOPENSSL-1.1.1-dev\build>perl Configure -QxAVX -fp:strict -Qprec -Zc:wchar_t -nologo -Qstd=c11 -O3 -MD threads no-deprecated shared VC-WIN64A && jom /J4
[snip]
        icl  /I "include" -DOPENSSL_USE_APPLINK -DDSO_WIN32 -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_API_COMPAT=0x10100000L "-DENGINESDIR=\"C:\\Program Files\\OpenSSL\\lib\\engines-1_1\"""-DOPENSSLDIR=\"C:\\Program Files\\Common Files\\SSL\"" -W3 -wd4090 -Gs0 -GF -Gy -nologo -DOPENSSL_SYS_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -D_CRT_SECURE_NO_DEPRECATE -DUNICODE -D_UNICODE /MD /O2  -QxAVX -fp:strict -Qprec -Zc:wchar_t -nologo -Qstd=c11 -O3 -MD /Zi /Fdapp.pdb -c /Fotest\buildtest_x509v3.obj "test\buildtest_x509v3.c"
icl: command line warning #10120: overriding '/O2' with '/O3'
buildtest_x509v3.c
        IF EXIST test\shlibloadtest.exe.manifest DEL /F /Q test\shlibloadtest.exe.manifest
error #31000: corrupt PDB file app.pdb; delete and rebuild; if problem persists, delete and try /Z7 instead
error code 3 ( can't write file, out of disk, etc.) opening pdb app.pdb
compilation aborted for test\buildtest_x509.c (code 1)
jom: C:\libOPENSSL-1.1.1-dev\build\Makefile [test\buildtest_x509.obj] Error 1
        link /nologo /debug /subsystem:console /opt:ref /out:test\shlibloadtest.exe @C:\Users\test\AppData\Local\Temp\shlibloadtest.exe.5752.221687.jom
        IF EXIST test\shlibloadtest.exe.manifest  mt -nologo -manifest test\shlibloadtest.exe.manifest -outputresource:test\shlibloadtest.exe
jom: C:\libOPENSSL-1.1.1-dev\build\Makefile [all] Error 2

The error is reproduced for the Release+Shared configuration if the number of jobs is greater than 3 (other configurations require more; this needs to be verified). Running 'nmake' or 'jom /J1' from that point continues the build and finishes all tasks successfully.

ICC has no compiler option for handling PDB files during parallel builds (similar to '/FS' for MSVC), so I assumed that ICC deals with this by itself. In that case, it turns out that ICC has a limit on the number of parallel jobs that can process a single PDB file, which MSVC does not have.

Since ICC on Windows imitates MSVC, would it be possible to make its PDB file handling identical to MSVC's in both cases above?

Environment:

 

Alexander

 

Thread Topic: Bug Report

How can I solve this Read-After-Write dependency?


I'm trying to vectorize the inner `for` of this function:

    void SIFTDescriptor::samplePatch(float *vec)
    {
       for (int r = 0; r < par.patchSize; ++r)
       {
          const int br0 = par.spatialBins * bin0[r]; const float wr0 = w0[r];
          const int br1 = par.spatialBins * bin1[r]; const float wr1 = w1[r];
          for (int c = 0; c < par.patchSize; ++c)
          {
             const float val = mask.at<float>(r,c) * grad.at<float>(r,c);

             const int bc0 = bin0[c];
             const float wc0 = w0[c]*val;
             const int bc1 = bin1[c];
             const float wc1 = w1[c]*val;

             // ori from atan2 is in range <-pi,pi> so add 2*pi to be surely above zero
             const float o = float(par.orientationBins)*(ori.at<float>(r,c) + 2*M_PI)/(2*M_PI);

             int   bo0 = (int)o;
             const float wo1 =  o - bo0;
             bo0 %= par.orientationBins;

             int   bo1 = (bo0+1) % par.orientationBins;
             const float wo0 = 1.0f - wo1;

             // add to corresponding 8 vec...
             if (wr0*wc0>0) {
                 vec[br0+bc0+bo0] += wr0*wc0 * wo0;
                 vec[br0+bc0+bo1] += wr0*wc0 * wo1;
             }
             if (wr0*wc1>0) {
                 vec[br0+bc1+bo0] += wr0*wc1 * wo0;
                 vec[br0+bc1+bo1] += wr0*wc1 * wo1;
             }
             if (wr1*wc0>0) {
                 vec[br1+bc0+bo0] += wr1*wc0 * wo0;
                 vec[br1+bc0+bo1] += wr1*wc0 * wo1;
             }
             if (wr1*wc1>0) {
                 vec[br1+bc1+bo0] += wr1*wc1 * wo0;
                 vec[br1+bc1+bo1] += wr1*wc1 * wo1;
             }
          }
       }
    }

However, Intel Advisor tells me that there are two Read-After-Write dependencies in:

     vec[br0+bc0+bo0] += wr0*wc0 * wo0;

And:

     vec[br1+bc0+bo0] += wr1*wc0 * wo0;

How can I resolve them? This is the first time I have tried to resolve such a dependency, and I'm struggling a bit.
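
The flagged statements are scatter updates: the index into vec depends on the data, so two different values of c can hit the same element, and a vectorized loop could lose one of the updates. One common way to handle this, sketched below on a simplified one-dimensional histogram (not the SIFT code itself), is to split the loop in two. The first pass computes bins and weights without any loop-carried dependency. The second pass applies the scatter-add in order. The name scatter_add_two_pass and the assumption that inputs lie in [0, 1) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Accumulate a weighted histogram in two passes.
// Assumption for this sketch: every value lies in [0, 1).
void scatter_add_two_pass(const std::vector<float>& values,
                          int num_bins,
                          std::vector<float>& hist) {
    assert(num_bins > 0 && hist.size() == static_cast<std::size_t>(num_bins));
    const std::size_t n = values.size();
    std::vector<int> bins(n);
    std::vector<float> weights(n);

    // Pass 1: each iteration writes only its own slot, so there is no
    // read-after-write dependency between iterations. This loop is a
    // candidate for vectorization.
    for (std::size_t i = 0; i < n; ++i) {
        const float scaled = values[i] * static_cast<float>(num_bins);
        const int b = static_cast<int>(scaled);
        bins[i] = b % num_bins;
        weights[i] = 1.0f - (scaled - static_cast<float>(b));
    }

    // Pass 2: the scatter-add stays sequential because two samples may
    // map to the same bin.
    for (std::size_t i = 0; i < n; ++i)
        hist[bins[i]] += weights[i];
}
```

Whether the split pays off depends on how expensive the first pass is relative to the scatter. In the function above, the mask, gradient and orientation arithmetic would go into the first pass.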

OneCore aka Universal Windows Driver compatibility


Hi.

I want to build a simple "int main(){ return 0; }" program that passes the API Validator OneCore compatibility test.

After reading https://msdn.microsoft.com/en-us/windows/hardware/drivers/develop/getting-started-with-universal-drivers and https://msdn.microsoft.com/en-us/library/windows/desktop/mt654040(v=vs.85).aspx I proceeded to build a simple default solution in VS2015 u3 on my Windows 10 build 14393 system with the program described above.

If my platform toolset is "WindowsApplicationForDrivers10.0" or even "Visual Studio 2015 (v140)", running the API Validator that comes with the Windows SDK as a post-build event succeeds:

"C:\Program Files (x86)\Windows Kits\10\bin\x64\apivalidator.exe"  -driverpackagepath:"$(OutDir)\" -SupportedApiXmlFiles:"C:\Program Files (x86)\Windows Kits\10\build\universalDDIs\$(Platform)\UniversalDDIs.xml" -ModuleWhiteListXmlFiles:"C:\Program Files (x86)\Windows Kits\10\build\universalDDIs\$(Platform)\ModuleWhiteList.xml"

If I switch the Platform Toolset to Intel C++ Compiler 17.0 u2, I get warnings that my executable now has a dynamic dependency on ntdll.dll, which is not OneCore compliant. According to my investigation, this is because libirc.lib was linked with a dependency on ntdll.lib, and the warnings appear for printf, vsnprintf, scanf, and variations of those functions.

Is it possible to generate a libirc.lib such that it ignores these default dependencies? This would allow the person building the executable or dll to choose the OneCore umbrella library (OneCore.lib) or the traditional desktop library (ntdll.lib) as the means to resolve those dependencies instead of having that decision being made in the libirc.lib.

Thank you.


OpenMP task depend pointers and structs


I have got the following code:

int foo(pm_t* t_lb){
  int i;
  int BLOCK=4;
#pragma omp task default(none) shared(t_lb, BLOCK)  private(i) \
  depend (inout: t_lb->a)

  {
    for (i=1;i < 1000 + BLOCK; i++){
      
      t_lb->a[i] =1.0f; 
    }
  }
}
int main(){
  
  pm_t* pm = NULL;
  pm = (pm_t*)calloc(1,sizeof(pm_t));
  
  
  pm->a = (double*)malloc(1000*sizeof(double));

#pragma omp parallel
  {
    /* Obtain thread number */
    int tid = omp_get_thread_num();
    printf("Hello World from thread = %d\n", tid);
    
#pragma omp master
    {
      foo(pm);
    }
  }
  
  return 0;
  
}

The structs:

typedef struct st pm_t;

struct st {
  double* a;
  double* t_b;
};

If this is compiled with Intel 17.0.2.174 I get:

tasks.c(10): error: invalid entity for this variable list in omp clause
    depend (inout: t_lb->a)

However I can compile it if I change the depend list to 

depend (inout: t_lb)

Is there a way to specify a dependency on the "a" pointer instead of on the whole structure "t_lb"?
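
As far as I understand the OpenMP 4.5 rules, a depend list item must be a variable, an array element or an array section, so a structure member such as t_lb->a is rejected. A possible workaround, sketched below, is to copy the member into a local pointer and name an array section of that pointer. The struct mirrors the one in the post. The function name fill_with_task and the explicit length parameter n are hypothetical. Without OpenMP enabled the pragmas are ignored and the code runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct pm_t {
    double* a;
    double* t_b;
};

// Fill the first n elements of t_lb->a inside a task whose dependency is
// expressed on the array the member points to, not on the struct.
void fill_with_task(pm_t* t_lb, std::size_t n) {
    assert(t_lb != nullptr && t_lb->a != nullptr);
    double* a = t_lb->a;  // local alias: a plain variable is a valid list item

    // firstprivate copies the pointer and the length into the task, so the
    // task does not refer to this function's locals after it returns.
    #pragma omp task default(none) firstprivate(a, n) depend(inout: a[0:n])
    {
        for (std::size_t i = 0; i < n; ++i)
            a[i] = 1.0;
    }
    // Wait here so callers can read the data right after the call.
    #pragma omp taskwait
}
```

Note that the loop in the post runs up to 1000 + BLOCK on an array of 1000 elements. The sketch uses the allocated length as the bound.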

 


ICC hangs on scipy 0.19.0 compilation of scipy/special/cython_special.c


ICC hangs on scipy 0.19.0 compilation of scipy/special/cython_special.c for 32 bit linux. Works fine on 64 bit linux, and 32 bit with gcc. The process just hangs and won't proceed. 

Compiling with numpy 0.12.0 and mkl/icc/ifort. Flags are 

Thread Topic: Bug Report

icl and icl++ command not found


Hello there,

I'm trying to use the clang compiler on Mac OS* X from the Terminal. It turns out the commands icl and icl++ are not recognized. I set up the compiler variables with compilervars.sh. I have Xcode 8.3.2 and the Intel Compiler doesn't seem to be supported at this time. Can anybody help me figure this out?

Inefficient memory access pattern and irregular stride accesses


I don't know if this is the right section; I apologize if it is not.

I'm trying to optimize this function:

bool interpolate(const Mat &im, float ofsx, float ofsy, float a11, float a12, float a21, float a22, Mat &res)
{
   bool ret = false;
   // input size (-1 for the safe bilinear interpolation)
   const int width = im.cols-1;
   const int height = im.rows-1;
   // output size
   const int halfWidth  = res.cols >> 1;
   const int halfHeight = res.rows >> 1;
   float *out = res.ptr<float>(0);
   const float *imptr  = im.ptr<float>(0);
   for (int j=-halfHeight; j<=halfHeight; ++j)
   {
      const float rx = ofsx + j * a12;
      const float ry = ofsy + j * a22;
      #pragma omp simd
      for(int i=-halfWidth; i<=halfWidth; ++i, out++)
      {
         float wx = rx + i * a11;
         float wy = ry + i * a21;
         const int x = (int) floor(wx);
         const int y = (int) floor(wy);
         if (x >= 0 && y >= 0 && x < width && y < height)
         {
            // compute weights
            wx -= x; wy -= y;
            int rowOffset = y*im.cols;
            int rowOffset1 = (y+1)*im.cols;
            // bilinear interpolation
            *out =
                (1.0f - wy) *
                ((1.0f - wx) *
                imptr[rowOffset+x] +
                wx *
                imptr[rowOffset+x+1]) +
                (       wy) *
                ((1.0f - wx) *
                imptr[rowOffset1+x] +
                wx *
                imptr[rowOffset1+x+1]);
         } else {
            *out = 0;
            ret =  true; // touching boundary of the input
         }
      }
   }
   return ret;
}

I'm using Intel Advisor to optimize it and even though the inner for has already been vectorized, Intel Advisor detected inefficient memory access patterns:

  • 60% of unit/zero stride access
  • 40% of irregular/random stride access

In particular, there are 4 gather (irregular) accesses in the following three instructions:

From my understanding, a gather access happens when the accessed element has the form a[b], where b is unpredictable. This seems to be the case with imptr[rowOffset+x], where both rowOffset and x are unpredictable.

At the same time, I see this Vertical Invariant which should happen (again, from my understanding) when elements are accessed with a constant offset. But actually I don't see where this constant offset

So I have 3 questions:

  1. Did I understand the problem of gather accesses correctly?
  2. What about the Vertical Invariant access? I'm less sure about this point.
  3. Finally, how can I improve/solve the memory access here?

Compiled with icpc 2017 update 3 with the following flags:

INTEL_OPT=-O3 -ipo -simd -xCORE-AVX2 -parallel -qopenmp -fargument-noalias -ansi-alias -no-prec-div -fp-model fast=2 -fma -align -finline-functions
INTEL_PROFILE=-g -qopt-report=5 -Bdynamic -shared-intel -debug inline-debug-info -qopenmp-link dynamic -parallel-source-info=2 -ldl
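
To make the source of the four gathers concrete, the following scalar sketch isolates the per-pixel work of the loop above. It assumes a plain row-major float buffer instead of cv::Mat, and the helper name bilinear_at is hypothetical. Each output pixel reads four input elements: two adjacent ones in row y and two adjacent ones in row y+1. Because x and y are computed from wx and wy, neighbouring SIMD lanes are not guaranteed to read consecutive addresses, which is what a gather handles.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Bilinear sample of a row-major image with `cols` columns at (wx, wy).
float bilinear_at(const std::vector<float>& img, int cols, float wx, float wy) {
    assert(cols > 1);
    const int rows = static_cast<int>(img.size()) / cols;
    const int x = static_cast<int>(std::floor(wx));
    const int y = static_cast<int>(std::floor(wy));
    assert(x >= 0 && y >= 0 && x < cols - 1 && y < rows - 1);

    const float fx = wx - static_cast<float>(x);  // horizontal weight
    const float fy = wy - static_cast<float>(y);  // vertical weight

    // The four data-dependent loads: (x, y), (x+1, y), (x, y+1), (x+1, y+1).
    const float* row0 = img.data() + y * cols;
    const float* row1 = row0 + cols;
    const float top    = (1.0f - fx) * row0[x] + fx * row0[x + 1];
    const float bottom = (1.0f - fx) * row1[x] + fx * row1[x + 1];
    return (1.0f - fy) * top + fy * bottom;
}
```

In the special case a11 == 1 and a21 == 0, x grows by one per iteration and y stays fixed, so the loads from each row become consecutive. For a general affine transform they do not, and some irregular access is inherent to the algorithm.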

 

 

 


multiple openmp reduction on stl::vector gets wrong results


Hi,

In the following test code, I use omp declare reduction on std::vector. Using icpc 17.0, an omp parallel reduction works correctly on a single std::vector. However, when I attempt to reduce into two separate vectors, v1 and v2, it returns v1=sum(v1)+sum(v2) and v2=0.0 (where sum is across threads). By my reading of the standard, I would expect it to return v1=sum(v1) and v2=sum(v2), which is what GCC 6.3 does.

Can my omp directives be tweaked to work with icpc 17.0?  Or is there another problem?

Thanks in advance,

Sean

(was advised to resubmit this post to C++ forum - original is here:  https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-ar...)

// following on from http://stackoverflow.com/questions/43168661/openmp-and-reduction-on-stdvector
//
#include <algorithm>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <vector>
#include <omp.h>

#pragma omp declare reduction(+ : std::vector<double> : std::transform(omp_in.begin(),omp_in.end(),omp_out.begin(),omp_out.begin(),std::plus<double>())) initializer (omp_priv=omp_orig)


int check1(int size){
  std::vector<double> result(size,0.0);

  int npart=100;
  int n;
  {
    int i;
// Works in GCC 6.3, Intel 17.0:
#pragma omp parallel for private(i,n) reduction(+:result)
    for (n=0; n<npart; n++){
      for (i=0; i<size; i++){
	result[i]  += i;
      }
    }
  }
  int fail=0;
  for (int i=0; i<size; i++) {
    if (result[i] != i*100){
      fail=1;
    }
    std::cout <<"i="<< i<<" "<<result[i]<<std::endl;
  }
  return fail;
}

int check2(int size){
  std::vector<double> result(size,0.0);
  std::vector<double> result2(size,0.0);

  int npart=100;
  int n;
  {
    int i;
// Works in GCC 6.3, fails in Intel 17.0:
//#pragma omp parallel for private(i,n) reduction(+:result),reduction(+:result2)
// Works in GCC 6.3, fails in Intel 17.0:
#pragma omp parallel for private(i,n) reduction(+:result,result2)
    for (n=0; n<npart; n++){
      for (i=0; i<size; i++){
	result[i]  += i;
	result2[i] += i;
      }
    }
  }
  int fail=0;
  for (int i=0; i<size; i++) {
    if (result[i] != i*100 || result2[i] != i*100){
      fail=1;
    }
    std::cout <<"i="<< i<<" "<<result[i]<<" "<<result2[i]<<std::endl;
  }
  return fail;
}



int main(int argc, char *argv[]) {

  int size;

  if (argc < 2)
    size = 10;
  else
    size = atoi(argv[1]);

  int fail1=check1(size);
  int fail2=check2(size);

  return fail1+fail2;
}
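
Until the behaviour is resolved, one possible workaround is to avoid the user-defined reduction for the two-vector case and merge thread-private copies by hand. The sketch below is an alternative formulation under that assumption, not a fix for the reported issue. The function name accumulate_two is hypothetical. Without OpenMP enabled the pragmas are ignored and the function still computes the same sums.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Add i to result[i] and result2[i] once per particle, using manual
// per-thread accumulators instead of reduction(+:result,result2).
void accumulate_two(int npart,
                    std::vector<double>& result,
                    std::vector<double>& result2) {
    assert(result.size() == result2.size());
    const std::size_t size = result.size();

    #pragma omp parallel
    {
        // Thread-private partial sums, zero-initialised.
        std::vector<double> local1(size, 0.0);
        std::vector<double> local2(size, 0.0);

        #pragma omp for nowait
        for (int n = 0; n < npart; ++n) {
            for (std::size_t i = 0; i < size; ++i) {
                local1[i] += static_cast<double>(i);
                local2[i] += static_cast<double>(i);
            }
        }

        // Merge one thread at a time into the shared vectors.
        #pragma omp critical
        {
            for (std::size_t i = 0; i < size; ++i) {
                result[i]  += local1[i];
                result2[i] += local2[i];
            }
        }
    }
}
```

The critical section serialises only the final merge, which costs one pass over the vectors per thread.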

 

Thread Topic: Bug Report

AVX-512 X code and icpc error #10236: File not found: ' '


Hi,

I am attempting to compile the following test code with ICC 17.0.3 20170404 contained in file name testCXXCompiler.cxx:

int main(){return 0;}

The command line I am using is: 

/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/icpc -xCORE-AVX512    -o testCXXCompiler.cxx.o -c testCXXCompiler.cxx

My machine is below:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                68
On-line CPU(s) list:   0-67
Thread(s) per core:    1
Core(s) per socket:    68
Socket(s):             1
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 87
Model name:            Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
Stepping:              1
CPU MHz:               998.101
BogoMIPS:              2793.41
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
NUMA node0 CPU(s):     0-67
NUMA node1 CPU(s):

The error is:

icpc: error #10236: File not found:  ''

Using the compiler flag -v, I get this:

icpc: error #10236: File not found:  ' '
icpc version 17.0.3 (gcc version 4.8.5 compatibility)
/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/mcpcom    --target_efi2 --lang=c++ -_g -mP3OPT_inline_alloca -D__ICC=1700 -D__INTEL_COMPILER=1700 -D__INTEL_COMPILER_UPDATE=3 -D__PTRDIFF_TYPE__=long "-D__SIZE_TYPE__=unsigned long" -D__WCHAR_TYPE__=int "-D__WINT_TYPE__=unsigned int""-D__INTMAX_TYPE__=long int""-D__UINTMAX_TYPE__=long unsigned int" -D__LONG_MAX__=9223372036854775807L -D__QMSPP_ -D__OPTIMIZE__ -D__NO_MATH_INLINES -D__NO_STRING_INLINES -D__GNUC_GNU_INLINE__ -D__GNUG__=4 -D__GNUC__=4 -D__GNUC_MINOR__=8 -D__GNUC_PATCHLEVEL__=5 -D__LP64__ -D_LP64 -D_GNU_SOURCE=1 -D__DEPRECATED=1 -D__GXX_WEAK__=1 -D__GXX_ABI_VERSION=1002 "-D__USER_LABEL_PREFIX__= " -D__REGISTER_PREFIX__= -D__INTEL_RTTI__ -D__EXCEPTIONS=1 -D__unix__ -D__unix -D__linux__ -D__linux -D__gnu_linux__ -B -Dunix -Dlinux "-_Asystem(unix)" -D__ELF__ -D__x86_64 -D__x86_64__ -D__amd64 -D__amd64__ "-_Acpu(x86_64)""-_Amachine(x86_64)" -D__INTEL_COMPILER_BUILD_DATE=20170404 -D__INTEL_OFFLOAD -D__pentium4 -D__pentium4__ -D__tune_pentium4__ -D__SSE2__ -D__SSE2_MATH__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -D__SSE__ -D__SSE_MATH__ -D__MMX__ -D__AVX__ -D__AVX_I__ -D__AVX2__ -D__FMA__ -D__BETA_BDW__ -D__AVX512F__ -D__AVX512CD__ -D__AVX512DQ__ -D__AVX512BW__ -D__AVX512VL__ -_k -_8 -_l --has_new_stdarg_support -_a -_b --gnu_version=40805 -_W5 --gcc-extern-inline -p --bool -tused -x --multibyte_chars -mGLOB_diag_suppress_sys --array_section --simd --simd_func --offload_mode=1 --offload_target_names=gfx,GFX,mic,MIC --offload_unique_string=icpc1289335044CQUak5 --bool -mGLOB_em64t=TRUE -mP1OPT_version=17.0-intel64 -mGLOB_diag_file=testCXXCompiler.cxx.diag -mGLOB_long_size_64 -mGLOB_routine_pointer_size_64 -mP1OPT_print_version=FALSE -mCG_use_gas_got_workaround=F -mP2OPT_align_option_used=TRUE -mGLOB_gcc_version=485 "-mGLOB_options_string=-xCORE-AVX512 -v -o testCXXCompiler.cxx.o -c" -mGLOB_cxx_limited_range=FALSE -mCG_extend_parms=FALSE 
-mGLOB_compiler_bin_directory=/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64 -mGLOB_as_output_backup_file_name=/tmp/icpcSuI5w8as_.s -mIPOPT_activate -mIPOPT_lite -mGLOB_instruction_tuning=0x0 -mGLOB_uarch_tuning=0x0 -mGLOB_product_id_code=0x22006d8e -mCG_bnl_movbe=T -mGLOB_extended_instructions=0x8000 -mGLOB_advanced_optim=TRUE -mP3OPT_use_mspp_call_convention -mP2OPT_subs_out_of_bound=FALSE -mP2OPT_disam_type_based_disam=2 -mP2OPT_disam_assume_ansi_c -mP2OPT_checked_disam_ansi_alias=TRUE -mGLOB_ansi_alias -mPGOPTI_value_profile_use=T -mP2OPT_il0_array_sections=TRUE -mGLOB_offload_mode=1 -mP2OPT_offload_unique_var_string=icpc1289335044CQUak5 -mP2OPT_hlo_level=2 -mP2OPT_hlo -mP2OPT_hpo_rtt_control=0 -mIPOPT_args_in_regs=0 -mP2OPT_disam_assume_nonstd_intent_in=FALSE -mGLOB_imf_mapping_library=/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64/libiml_attr.so -mPGOPTI_gen_threadsafe_level=0 -mIPOPT_lto_object_enabled -mIPOPT_lto_object_value=1 -mIPOPT_obj_output_file_name=testCXXCompiler.cxx.o -mIPOPT_whole_archive_fixup_file_name=/tmp/icpcwarchCdWFgm -mGLOB_linker_version=2.25.1 -mGLOB_driver_tempfile_name=/tmp/icpctempfileaSVN8X -mP3OPT_asm_target=P3OPT_ASM_TARGET_GAS -mGLOB_async_unwind_tables=TRUE -mGLOB_obj_output_file=testCXXCompiler.cxx.o -mGLOB_source_dialect=GLOB_SOURCE_DIALECT_C_PLUS_PLUS -mP1OPT_source_file_name=testCXXCompiler.cxx -mP1OPT_full_source_file_name=/home/jlefman/src/ITK/build-avx512-001/testCXXCompiler.cxx -mGLOB_eh_linux testCXXCompiler.cxx
#include "..." search starts here:
#include <...> search starts here:
 /opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include/intel64
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include/icc
 /opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/include
 /usr/include/c++/4.8.5
 /usr/include/c++/4.8.5/x86_64-redhat-linux
 /usr/include/c++/4.8.5/backward
 /usr/local/include
 /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
 /usr/include/
 /usr/include
End of search list.

System info uname -a

Linux nodename.domain.com 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

What is the cause of this issue? How can it be resolved?

Thank you.

Thread Topic: Bug Report

Intel C++ compiler for Intel Itanium architecture


*** Intel C++ compiler for Intel Itanium architecture ***

internal error: assertion failed at: "shared/cfe/edgcpfe/folding.c"


Hello,

From v16 update 3 onward, the following code, which looks valid, generates an internal error:

>icl main.cpp
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.187 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

main.cpp
main.cpp(8): internal error: assertion failed at: "shared/cfe/edgcpfe/folding.c", line 8623

        if (__has_assign(R))
                         ^

compilation aborted for main.cpp (code 4)

 

The code in question is:

#include <stdio.h>

struct R {
	void operator=(R& r) {}
};

int main() {
	if (__has_assign(R))
		printf("success\n");
	else
		printf("fail\n");

	return 0;
}

The problem is that in my project, similar code comes from the headers of a product SDK that we integrate with and have no control over.

Best regards,

Alexander

Attachment: main.cpp (text/x-c++src, 172 bytes)

Thread Topic: Bug Report

improper UTF-8 characters handling by ICC on Windows


Hello, everyone,

Building ICU4C using ICC on Windows fails with the following error:

sh-4.4$ (INSTALLDIR="$PWD/../../ICC64RH"&& (CC="icl" CFLAGS="-MD" CXX="icl" CXXFLAGS="-MD" LD="xilink" ./configure --prefix="$INSTALLDIR" --disable-debug --enable-release --enable-shared --disable-static >_configure.log && make >_make.log) 2>_stderr.log)
[snip]
sh-4.4$ make tests
[snip]
icl   -DHAVE_DLOPEN=0 -DU_HAVE_ATOMIC=1 -DU_HAVE_MMAP=0 -DU_HAVE_DIRENT_H=0 -DU_HAVE_POPEN=0 -DU_HAVE_STRTOD_L=0  -DU_RELEASE=1 -D_CRT_SECURE_NO_DEPRECATE -I. -I../../common -I../../i18n -I../../tools/toolutil -I../../tools/ctestfw -DUNISTR_FROM_CHAR_EXPLICIT= -DUNISTR_FROM_STRING_EXPLICIT= -DUCHAR_TYPE=char16_t -DU_ATTRIBUTE_DEPRECATED= -DWIN32 -DCYGWINMSVC -D'U_TOPSRCDIR="../../"' -D'U_TOPBUILDDIR="/c/libICU4C-59.1/build/source/"' -MD   -GF -nologo -EHsc -Zc:wchar_t -c   -Forbbitst.o rbbitst.cpp
rbbitst.cpp
rbbitst.cpp(1282): error: too many characters in character constant
              if (c == u'???') {
                       ^

compilation aborted for rbbitst.cpp (code 2)
make[2]: *** [../../config/mh-msys-msvc:142: rbbitst.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory '/c/libICU4C-59.1/build/source/test/intltest'
make[1]: *** [Makefile:68: all-recursive] Error 2
make[1]: Leaving directory '/c/libICU4C-59.1/build/source/test'
make: *** [Makefile:213: tests] Error 2

This relates to the code:

if (c == u'•') {

in the specified file and to the compiler option '-utf-8', which appeared in the ICU4C 59.1 MSVC toolchain.

ICC doesn't support this option and issues a warning:

icl: command line warning #10006: ignoring unknown option '/utf-8'

A similar error can be reproduced using MSVC if ICU is built without the '-utf-8' option:

sh-4.4$ (INSTALLDIR="$PWD/../../MSVC64RH"&& (CC="cl" CFLAGS="-MD" CXX="cl" CXXFLAGS="-MD" LD="link" ./configure --prefix="$INSTALLDIR" --disable-debug --enable-release --enable-shared --disable-static >_configure.log && make >_make.log) 2>_stderr.log)
[snip]
sh-4.4$ make tests
[snip]
make[2]: Entering directory '/c/libICU4C-59.1/build/source/test/intltest'
cl   -DHAVE_DLOPEN=0 -DU_HAVE_ATOMIC=1 -DU_HAVE_MMAP=0 -DU_HAVE_DIRENT_H=0 -DU_HAVE_POPEN=0 -DU_HAVE_TZNAME=0 -DU_HAVE_STRTOD_L=0  -DU_RELEASE=1 -D_CRT_SECURE_NO_DEPRECATE -I. -I../../common -I../../i18n -I../../tools/toolutil -I../../tools/ctestfw -DUNISTR_FROM_CHAR_EXPLICIT= -DUNISTR_FROM_STRING_EXPLICIT= -DUCHAR_TYPE=char16_t -DU_ATTRIBUTE_DEPRECATED= -DWIN32 -DCYGWINMSVC -D'U_TOPSRCDIR="../../"' -D'U_TOPBUILDDIR="/c/libICU4C-59.1/build/source/"' -MD   -GF -nologo -EHsc -Zc:wchar_t -c   -Forbbitst.o rbbitst.cpp
rbbitst.cpp
rbbitst.cpp(1282): error C2015: too many characters in constant
make[2]: *** [../../config/mh-msys-msvc:142: rbbitst.o] Error 2
make[2]: Leaving directory '/c/libICU4C-59.1/build/source/test/intltest'
make[1]: *** [Makefile:68: all-recursive] Error 2
make[1]: Leaving directory '/c/libICU4C-59.1/build/source/test'
make: *** [Makefile:213: tests] Error 2

As for mingw-w64, it doesn't reproduce this error even without the '-fexec-charset=' and '-finput-charset=' options.

Using the undocumented ICC option '-Qoption,cpp,"--uliterals"' (see threads ICC v13 Beta Questions, C++11, Intel Parallel Studio 13 XE & Unicode String Literals, etc.) doesn't solve the issue.

 

Since ICC on Windows imitates MSVC, would it be possible to add support for the '-utf-8' option? It would make ICC's handling of UTF-8 characters on Windows identical to MSVC's.
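
For code that can be edited, a way to write such a constant independently of the source encoding is a universal character name, which uses ASCII characters only. The sketch below assumes the character in question is U+2022 BULLET. The names kBullet and is_bullet are hypothetical. This does not help when the source, as with the ICU tests here, cannot be changed.

```cpp
// The escape \u2022 names the code point directly, so the compiler does not
// need to decode any non-ASCII bytes in the source file.
constexpr char16_t kBullet = u'\u2022';

static_assert(kBullet == 0x2022, "expected U+2022 BULLET");

// Returns true when c is the bullet character.
bool is_bullet(char16_t c) {
    return c == kBullet;
}
```

The static_assert documents the expected code point and fails at compile time if the literal is interpreted differently.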

 

Environment:

  • Windows 10 x64,
  • IPSXE 2017 Update 2,
  • VS 2015 Update 3,
  • Windows SDK 10.0.14393.33,
  • MSYS2 20161025,
  • ICU4C 59.1.

 

Alexander

 


NUMA system detection


Hello,

I was wondering what the difference is between a system being NUMA and a system having NUMA enabled. Moreover, how can I tell whether the system is NUMA from inside a compiler? Does being a NUMA system depend solely on the processor inside the system? That is, if I have processor X, can I tell from that alone that systems with such processors are NUMA systems?

Thank you,

Iulia

Thread Topic: How-To

Using custom Clang version


Hello,

I am trying to upgrade a project to a newer compiler stack and language standard. I end up with some strange errors. I have this command line:

/opt/intel/compilers_and_libraries_2016.4.210/mac/bin/intel64/icpc -D_REENTRANT -DHAVE_GCCVISIBILITY -DMACOSX \
-mmacosx-version-min=10.11 -clangxx-name=/usr/local/llvm-3.9.1/bin/clang++ -std=c++11 -fno-omit-frame-pointer \
-fPIC -fvisibility=hidden -fmath-errno -qopenmp -fp-model precise -g -Wall -Wno-deprecated -pthread -o foo.o -c foo.cpp

It returns:

In file included from foo.cpp(9):
bar.hpp(16): catastrophic error: cannot open source file "iostream"
    #include <iostream>
                       ^
compilation aborted for foo.cpp (code 4)

It builds fine if built for macOS 10.8 and gnu++98.

Another problem that I am facing is that I can't get LLVM functionality, for example llvm::dyn_cast is unknown no matter what I include.

Has anybody faced similar problems?

Thread Topic: Question

Assembly code generated by ICL for Windows


Hi,

I am trying to use ICL to generate assembly, do some processing on it, and finally generate the binary from the updated assembly. However, this workflow does not work because the assembly generated by ICL does not seem to be accepted by ML64.exe.

The following is the output of compiling a simple Hello World style program:

main() {  printf ("test\n"); }

Commands and outputs below:

$ icl /S test.c
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.187 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

test.c

$ icl test.asm
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.2.187 Build 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

test.asm
 Assembling: test.asm
test.asm(18) : error A2008:syntax error : .1
test.asm(27) : error A2008:syntax error : .5
test.asm(35) : error A2008:syntax error : .2
test.asm(42) : error A2008:syntax error : .3
test.asm(50) : error A2008:syntax error : .main
test.asm(57) : error A2008:syntax error : .main
test.asm(58) : error A2006:undefined symbol : .B1
test.asm(59) : error A2006:undefined symbol : .unwind

Would this be fixable?

Thanks.

david

Thread Topic: 

Question

FMA not used

Hi,

I was surprised to spot the following behavior of the Intel compiler (17.0.2 20170213 on Linux) when using -xCORE-AVX2. The following code generates FMA instructions:

double norm(double* x, int n) {
  double ans = 0.0;
  for (int i = 0; i < n; ++i) {
    ans += x[i] * x[i];
  }
  return ans;
}

but the following code does not

float norm(float* x, int n) {
  float ans = 0.0f;
  for (int i = 0; i < n; ++i) {
    ans += x[i] * x[i];
  }
  return ans;
}

Is there a reason for this, or is it a missed optimization from the compiler?

Best regards,

Francois
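
As a point of comparison for the float case above, here is a minimal sketch that requests the fused operation explicitly through the standard `std::fma` from `<cmath>`, instead of relying on the compiler to contract `a * b + c` on its own. The function name `sum_of_squares_fma` is hypothetical. Whether the call becomes a hardware FMA instruction or a library call still depends on the target flags, and this is not a statement about why the compiler declined to contract the original loop.

```cpp
#include <cmath>

// Hypothetical variant of the float loop: each step computes
// x[i] * x[i] + ans with a single rounding via std::fma.
float sum_of_squares_fma(const float* x, int n) {
  float ans = 0.0f;
  for (int i = 0; i < n; ++i) {
    ans = std::fma(x[i], x[i], ans);
  }
  return ans;
}
```

Comparing the assembly of this version with the original loop may help show whether the difference comes from the contraction decision or from how the float reduction is vectorized.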

Intel C++ compiler XE 12.1 for Visual Studio 2015 Professional

Hello,

I have some old code that needs Intel C++ compiler XE 12.1 to compile. I have Visual Studio 2015 Professional on my laptop, which runs Windows 7.

I installed compiler XE 12.1, but during installation the installer said some components could not be installed because they require a Microsoft development product (see attached picture). I installed it anyway, but VS 2015 cannot detect the compiler from tools.

For VS, I already have the "Common Tools for Visual C++ 2015" installed (which I think is needed). What else do I need?

Is it because VS 2015 does not support compiler XE 12.1? If so, is there any workaround so that I can compile my old code in VS 2015?

Thanks a lot in advance!

 

Thread Topic: 

Help Me

Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588

I am running into the following ICE (internal compiler error) with the reduced test case below.

Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588

template <char>
struct Index {
  constexpr Index() : value_(0) {
  }

 private:
  int value_;
};

template <int>
int foo() {
  static constexpr Index<'i'> i;
  return 0;
}

int bar() {
  return foo<3>();
}

See versions below
 

[...]:~/test> gcc --version
gcc (GCC) 6.3.0 20161221 (Cray Inc.)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[...]:~/test> icpc --version
icpc (ICC) 17.0.2 20170213
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

[...]:~/test> gcc -std=c++14 -c -o test.o test.cpp
[...]:~/test> icpc -std=c++14 -c -o test.o test.cpp
Internal error: assertion failed at: "shared/cfe/edgcpfe/decl_inits.c", line 588


compilation aborted for test.cpp (code 4)

This can be reproduced across lots of platforms and compiler versions, though this particular instance is in the Cray Linux Environment with PrgEnv-intel/6.0.3 and gcc/6.3.0

Thread Topic: 

Bug Report

CL in ICL VS2017 window doesn't find stdint.h

C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.0.065\windows\compiler\include\stdint.h(39): fatal error C1083: Cannot open include file: '../../vc/include/stdint.h': No such file or directory

This occurs with both XE2017u4 and XE2018 beta.

If the CL build step is run in the VS2017 developer prompt, the resulting .obj has no problem linking in the Intel cmd window for either VS2015 or 2017.

The CL build failure occurs only when VS2017 is set under the ICL cmd window.  ICL doesn't have a problem.

Warnings about redefinition of HUGE_VAL.. occur both in the working and non-working cases.  They show that math.h is being processed successfully although with warnings, along the same include path.  Those redefinitions are present in the ICL math.h.

There may be something wrong with my setup, but I don't see it, other than the strange comment about looking for a posix style include path even though running under DOS style cmd window. That is actually the way the paths are written in the ICL installation of stdint.h.

Thread Topic: 

Bug Report