Compiling MPICH/FFTW/PETSc with 2019 version

February 7, 2019, 11:38 pm

Latest and popular articles on Intel Technologies

Dear All,

I wanted to upgrade to the newest Parallel Studio XE Cluster Edition and installed 2019 Update 2.

The installation worked, but now I have a lot of problems compiling the libraries I need.

When compiling MPICH, I get:

checking size of bool... 0
configure: error: unable to determine matching C type for C++ bool

For FFTW, I had a similar problem with some definitions (https://software.intel.com/en-us/forums/intel-c-compiler/topic/804830). With PETSc, which I built against Intel MPI because of the issues with MPICH, the linker fails.

I am using Ubuntu 18.04. Are there any changes in the behavior of the 2019 compiler versions with respect to gcc compatibility or the use of Linux headers?
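
For context, the failing configure step looks for a C integer type whose size matches C++ `bool`. The sketch below performs a comparable check by hand. It is not MPICH's actual configure test, and the helper name `matching_c_type_for_bool` is made up for illustration. If a program like this compiles with `icpc` and finds a match, the `0` in the configure output may point to a failed test compile rather than to a real property of the compiler; `config.log` would be the place to confirm that.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: name a C integer type with the same size as C++ bool.
// Returns an empty string when no candidate matches, which is the situation
// the configure error message describes.
inline std::string matching_c_type_for_bool() {
    const std::size_t n = sizeof(bool);
    if (n == sizeof(unsigned char))  return "unsigned char";
    if (n == sizeof(unsigned short)) return "unsigned short";
    if (n == sizeof(unsigned int))   return "unsigned int";
    if (n == sizeof(unsigned long))  return "unsigned long";
    return "";
}
```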

↧

Debian Buster - can't compile simple program

February 10, 2019, 12:36 pm

I was able to install the Intel compiler, but I can't compile a simple hello world program.

$ ls
test.cpp
$ cat test.cpp 
#include <iostream>

int main() {
	std::cout << "Hello world"<< std::endl;
	return 0;
}
$ icpc test.cpp 
In file included from test.cpp(1):
/usr/include/c++/8/iostream(38): catastrophic error: cannot open source file "bits/c++config.h"
  #include <bits/c++config.h>
                             ^

compilation aborted for test.cpp (code 4)
$ icpc -v
icpc version 19.0.1.144 (gcc version 8.2.0 compatibility)
$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 8.2.0-16' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 8.2.0 (Debian 8.2.0-16)

↧

Averaging Filter SDLT Sample

February 15, 2019, 2:51 am

Hi all

I am not able to understand the output we receive from the Averaging Filter sample.

With SDLT, the output is:

The time taken in number of ticks is 774874281
The time taken just for filter operation is 55768898

With the Serial sample, the output is better:

The time taken in number of ticks is 766142320
The time taken just for filter operation is 51793809

These numbers suggest that the Serial sample performs better, so why use SDLT?

Could you point me to correct and detailed documentation, or explain why someone should prefer Intel SDLT? I also do not see any difference in the output image.

Thank you
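
To put the tick counts above in proportion, this small sketch computes how much slower one run is relative to another. The function name `percent_slower` is made up for illustration; the inputs in the note below are the figures quoted in the post.

```cpp
#include <cstdint>

// Relative slowdown of `candidate` versus `baseline`, in percent.
// A positive result means `candidate` took more ticks than `baseline`.
inline double percent_slower(std::uint64_t candidate, std::uint64_t baseline) {
    const double c = static_cast<double>(candidate);
    const double b = static_cast<double>(baseline);
    return 100.0 * (c - b) / b;
}
```

With the quoted figures, `percent_slower(55768898, 51793809)` is roughly 7.7 for the filter operation alone and `percent_slower(774874281, 766142320)` is roughly 1.1 for the whole run. Each figure comes from a single measurement, so run-to-run noise is not accounted for.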

↧

Cannot convert non-capturing lambda closure to function pointer using the unary + operator

February 16, 2019, 3:34 am

I've run into an issue when trying to use some generic code involving lambda functions. Here is a minimal example that reproduces it:

#include <iostream>
int main() {
    auto f = [] { std::cout << "this is a test"<< std::endl; };
    void(*fp0)() = f; // OK
    auto fp1 = +f;    // Does not compile
    fp0();
    fp1();
    return 0;
}

And this is the error the compiler raises:

error: more than one conversion function from "lambda []()->void" to a built-in type applies:
function "lambda []()->void::operator void (*)()() const"
function "lambda []()->void::operator void (*)()() const"
auto fp1 = +f; // Does not compile

The main problem with this bug is that it makes it impossible to convert lambdas to regular function pointers in a generic context where the signature is not known beforehand. I've tested the given code with other compilers and it just works. Currently I'm using Intel Parallel Studio XE 2019 Update 2.
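
One possible way to get a function pointer generically without the unary `+` is to deduce the signature from the closure's `operator()` and then rely on the implicit conversion, which is the form reported as working above (`void(*fp0)() = f;`). This is only a sketch: the names `to_function_pointer` and `to_function_pointer_impl` are made up, it handles only non-generic, non-mutable, non-capturing lambdas, and it has not been verified against the Intel compiler version mentioned in the post.

```cpp
// Deduce R and Args... from the closure's call operator, then convert the
// closure through its implicit conversion to a plain function pointer.
template <typename F, typename R, typename... Args>
auto to_function_pointer_impl(F f, R (F::*)(Args...) const) -> R (*)(Args...) {
    return f;  // implicit conversion; only exists for non-capturing lambdas
}

// Entry point: pass the lambda, get back a pointer with the deduced signature.
template <typename F>
auto to_function_pointer(F f)
    -> decltype(to_function_pointer_impl(f, &F::operator())) {
    return to_function_pointer_impl(f, &F::operator());
}
```

With the lambda from the example, `auto fp1 = to_function_pointer(f);` would take the place of `auto fp1 = +f;`.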

↧

assertion failed: reconcile_routine_types: shared param types unexpected (shared/cfe/edgcpfe/decls.c, line 5524)

January 29, 2019, 4:28 am

Hello!

Here is a minimal working example in which combining a templated method type declared through an alias with the static keyword results in a compilation failure:

template<typename T>
using func_t = void (T x);

class MyClass
{
public:
template<typename T>
    static func_t<T> method;
};

template<typename T>
void MyClass::method(T x)
{
}

Namely, I am getting the following error:

internal error: assertion failed: reconcile_routine_types: shared param types unexpected (shared/cfe/edgcpfe/decls.c, line 5524)

If I simply substitute the templated typedef, compilation works fine. The example compiles with other compilers (GNU) but fails with the Intel 2015, 2016, 2017 and 2018 compilers with the -std=c++11 flag.
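
For reference, "substituting the templated typedef" presumably means writing the function type out in the member declaration instead of going through `func_t<T>`. A sketch of that variant, under that assumption, looks like this:

```cpp
class MyClass
{
public:
    // Same member as in the failing example, with func_t<T> written out
    // as an ordinary function declaration.
    template<typename T>
    static void method(T x);
};

template<typename T>
void MyClass::method(T x)
{
    (void)x;  // body intentionally empty, as in the example above
}
```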

↧

intel v19 compilation failure - tensorflow-1.12.0

February 19, 2019, 11:10 pm

Hi,
I was able to compile TensorFlow v1.12.0 (CPU version) from source with gcc 4.8.5 (the system default), but I am currently having issues compiling TensorFlow with the Intel compiler. Here are some details about the build environment:

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS Linux release 7.6.1810 (Core)
Python version: 3.6.8
Bazel version : 0.19.1
GCC/Compiler version : gcc 4.8.5 + intel/19.0.0.117
CUDA/cuDNN version: NA (CPU Build)
GPU model and memory: NA (CPU build)

The primary reason for rebuilding with intel/19.0.0.117 is to compare the relative performance gain on some sample TF benchmarks.

tensorflow-1.12.0]$ icc --version
icc (ICC) 19.0.0.117 20180804
Copyright (C) 1985-2018 Intel Corporation.  All rights reserved.

The command used for compilation is as follows:

CC=icc bazel build --config=mkl --define=grpc_no_ares=true --copt=-xHOST \
//tensorflow/tools/pip_package:build_pip_package

The error messages from this command line can be found in build_logs.txt and build_logs2.txt.

Since bazel seems to run a parallel build (like make -j N), I also tried building TensorFlow serially:

CC=icc bazel build -s --config=mkl --jobs=1 --define=grpc_no_ares=true --copt=-xHOST \
//tensorflow/tools/pip_package:build_pip_package

The compilation logs (for --jobs=1) are in build_logs3_serial1.txt. Please let me know if any further information is required from my end.
Eagerly awaiting your replies.

Attachment	Size
Download build_logs3_serial1.txt	4.81 MB
Download build_logs2.txt	1.66 MB
Download build_logs.txt	4.86 MB

↧

The license file(s) you provided are not valid for this product

February 21, 2019, 7:36 am

I downloaded a fresh copy of the license file.

I downloaded parallel_studio_xe_2018_update4_professional_edition.tgz, unpacked it, and changed into the resulting directory.

I then ran install_gui.sh.

It checks the tar files, then asks about activation.

I chose "Choose alternative activation" because I have the license file. (BTW, there is no internet connection on this server.)

I then chose "activate offline".

I browsed to the license file.

Clicked "next" and got the "The license file(s) you provided are not valid for this product" error.

Can you help with this install?

thanks

↧

Missing post after editing title

February 21, 2019, 8:59 am

After editing a post title to note that it was reporting a bug, it disappeared from the posts list and therefore is no longer accessible without its direct link.

The post I'm referring to is this:

[Bug report] Cannot convert non-capturing lambda closure to function pointer using the unary + operator

The post was published last Saturday, and its title was edited the following Monday. I haven't received any message about a need for extra moderation, so I think there must be some kind of problem.

I wish it could reappear in the list so it can get some attention, thanks.

↧

difference in vectorization with different versions of Intel compiler

February 20, 2019, 4:40 pm

Hi all,

I have a complex code that calls many Fortran and MKL functions, and I need to optimize it. As a first step, I started with the compiler optimization level -O3. I compiled my code with MKL and the Intel compilers, once with version 14 and once with version 16. I used the verbose flag to get more information and found that each version vectorizes a different set of loops. Although v14 vectorizes more loops, the code compiled with v16 performed better because more of the critical loops were vectorized. I expected v16 to vectorize more loops, or at least all of the loops that v14 vectorized, but this is not the case. I would like to know why this happens, and whether anybody has seen over-vectorization slow a code down.

I should also add that I tested the code on Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz processors. The overall performance of v16 was better than v14. For some cases (depending upon the input options that trigger different parts of the code) the performance was improved up to 33% compared to v14.

Any other hint in this respect is greatly appreciated.

Regards,

Hossein

↧

How to use Compiler 17 from Parallel Studio

February 21, 2019, 12:44 pm

Hello,

I have installed the compiler and the MPSS for the Xeon Phi. This post: https://software.intel.com/en-us/forums/intel-many-integrated-core/topic... says I need to use compiler 17. Is that available in the most recent Parallel Studio version I installed?

If so, where is it located/how do I get access to icpc?

If not, how do I get the correct compiler?

Thanks!

↧

Intel Parallel studio XE 2018 Update 3 Installation hangs

February 25, 2019, 11:34 pm

Hi,

I am trying to install Intel Parallel Studio XE 2018 Update 3, but after I provide all the details, license files, etc., the installation hangs on "installing Component 1". I have Visual Studio 2017 installed on my machine. Checking the logs, I found that the installation hangs at the point below.

[MSI processing]: INFO: request to install msi: c:\Windows\Intel\parallel_studio_xe_2018_update3_professional_edition_for_cpp_setup\installs\parallel_studio_xe\054\studio_common_p_18.0.3.054\studio_common_p_18.0.3.054.msi

The above msi is present and I can install it from command prompt.

any help?

Thanks.

↧

"Intel Advisor cannot show source code for this location"

February 22, 2019, 6:22 pm

Hello,

I recently started using Intel Compiler 2019 and am learning about the compiler. After working through a few tutorials, I started checking how I could improve my C++ program. Unlike in the tutorial, however, Intel Advisor shows "Intel Advisor cannot show source code for this location. Make sure that the Source Search locations in the Project Properties dialog contain correct location(s) of your application's source files."

This website (https://software.intel.com/en-us/advisor-user-guide-troubleshooting-sour...) shows that this message occurs when calling into third-party library routine code. I do use three third-party libraries, namely, Eigen, OpenCV, and Boost in my program. Among these, I'm not too concerned about OpenCV and Boost. But I'd like to see how I can improve my source code with Eigen.

Is there any way to display the specific code that I should pay attention to, even with Eigen?

The environment for programming is as follows:

OS: Windows 10

IDE: Visual Studio 2015

Intel Compiler: 2019 (Evaluation version)

Regards,

↧

OpenMP parallelization and compiler options

February 25, 2019, 12:57 am

Fellow code developers,

I have several years of experience parallelizing code with MPI. I recently began using OpenMP and quickly ran into a lot of problems. Right now the most troubling one is the interaction between the Intel compiler's own optimization and my hand-written OpenMP parallelization. Allow me to demonstrate the problem with a simple case:

Suppose we have two functions. Each function loops over a vector, and there is no data dependency between the two functions. Now I use OpenMP to create two threads and let each thread handle one of the functions. In theory, the wall time of the two-thread version should be half that of the one-thread version. In my experiment, this holds only when the compiler optimization flag is -O0. With -O1, it no longer holds.

If anyone can offer some insight into the problem, it will be greatly appreciated.

This is the test code:

main.cpp

#include <iostream>
#include <omp.h>
#include <string>
#include <vector>
#include <stdio.h>
#include <chrono>
#include "tools.h"

#define N 60000000

using namespace std;
using namespace chrono;

// In-place running sum over row i; iteration j depends on iteration j-1.
void func(int i, vector<vector<int> > &data) {
    for (int j=1; j<N; ++j) {
        data[i][j] = data[i][j-1] + data[i][j];
    }
}

int main(int argc, char *argv[]) {
    vector<vector<int> > data(24, vector<int>(N, 1));

    string hostName, Ip;
    GetHostInfo(hostName, Ip);  // return value ignored; strings are printed as they are
    cout << "hostname: "<< hostName << ", ip: "<< Ip << endl;

    auto start = system_clock::now();

    // One section per row, so each row can be handled by a different thread.
    #pragma omp parallel shared(data)
    {
        #pragma omp sections
        {
            #pragma omp section
            func(0, data);
            #pragma omp section
            func(1, data);
            #pragma omp section
            func(2, data);
            #pragma omp section
            func(3, data);
            #pragma omp section
            func(4, data);
            #pragma omp section
            func(5, data);
            #pragma omp section
            func(6, data);
            #pragma omp section
            func(7, data);
            #pragma omp section
            func(8, data);
            #pragma omp section
            func(9, data);
            #pragma omp section
            func(10, data);
            #pragma omp section
            func(11, data);
            #pragma omp section
            func(12, data);
            #pragma omp section
            func(13, data);
            #pragma omp section
            func(14, data);
            #pragma omp section
            func(15, data);
            #pragma omp section
            func(16, data);
            #pragma omp section
            func(17, data);
            #pragma omp section
            func(18, data);
            #pragma omp section
            func(19, data);
            #pragma omp section
            func(20, data);
            #pragma omp section
            func(21, data);
            #pragma omp section
            func(22, data);
            #pragma omp section
            func(23, data);
        }
    }

    auto end = system_clock::now();
    auto duration = duration_cast<microseconds>(end-start);
    cout << "time: "<< double(duration.count()) * microseconds::period::num / microseconds::period::den << "s\n";

    return 0;
}

tools.h

#pragma once

#include <iostream>    /* cout */
#include <string>      /* string */
#include <unistd.h>    /* gethostname */
#include <netdb.h>     /* struct hostent */
#include <arpa/inet.h> /* inet_ntop */
#include <stdlib.h>    /* system */

// Fills hostName and Ip for the local machine; returns false if the lookup fails.
inline bool GetHostInfo(std::string& hostName, std::string& Ip) {
    char name[256];
    gethostname(name, sizeof(name));
    hostName = name;

    struct hostent* host = gethostbyname(name);
    if (NULL==host) {
        std::cout << "hostname lookup failed";
        return false;
    }

    char ipStr[32];
    const char* ret = inet_ntop(host->h_addrtype, host->h_addr_list[0], ipStr, sizeof(ipStr));
    if (NULL==ret) {
        std::cout << "hostname transform to ip failed";
        return false;
    }

    Ip = ipStr;
    return true;
}

// Standalone test driver. TOOLS_STANDALONE is a hypothetical macro, used here so
// that this main() does not clash with the one in main.cpp, which includes tools.h.
#ifdef TOOLS_STANDALONE
int main(int argc, char *argv[]) {
    std::string hostName;
    std::string Ip;

    bool ret = GetHostInfo(hostName, Ip);
    if (true == ret) {
        std::cout << "hostname: "<< hostName << std::endl;
        std::cout << "Ip: "<< Ip << std::endl;
    }

    system("cat /proc/cpuinfo | grep 'core id'");
    return 0;
}
#endif
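
The test code spells out one `section` per row. As a more compact sketch of the same experiment, the rows can be handed out with a `parallel for` instead. The helper names `prefix_sum_row` and `prefix_sum_all_rows` are made up for illustration, and because the scheduling differs from `sections`, timings are not guaranteed to match the program above.

```cpp
#include <cstddef>
#include <vector>

// Running sum over one row. Each element depends on the previous one,
// so the inner loop itself cannot be split across threads.
inline void prefix_sum_row(std::vector<int>& row) {
    for (std::size_t j = 1; j < row.size(); ++j) {
        row[j] += row[j - 1];
    }
}

// One independent task per row, the same work the 24 sections describe.
// If OpenMP is not enabled, the pragma is ignored and the rows run serially.
inline void prefix_sum_all_rows(std::vector<std::vector<int> >& data) {
    const int rows = static_cast<int>(data.size());
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < rows; ++i) {
        prefix_sum_row(data[i]);
    }
}
```

Timing `prefix_sum_all_rows` with one thread and then with several (for example by setting `OMP_NUM_THREADS`) gives the same kind of comparison as the sections version.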

↧

-qopt-malloc-options=(n) not being recognized in Intel Compiler 19.0

February 25, 2019, 4:45 am

System: macOS 10.14 (Mojave)

Chip Family: Skylake, Core i7

Intel Compiler Version: v19.0

I'm having an issue compiling a program with the -qopt-malloc-options flag: icpc states that it does not recognize the flag and ignores it. The program contains quite a few malloc calls (with the required headers), so an alternative malloc algorithm would help improve performance. The code compiles successfully with other flags, such as -qopt-prefetch, which are not ignored.

The compile command that includes the malloc option is as follows:

icpc -std=c++17 -O3 -qopt-malloc-options=4 kdtreetest kdtreetest.cpp kdtree.cpp

Hopefully I'm just missing something small...!

Appreciate the help in advance.

↧

Integration with MS VisualStudio

February 27, 2019, 9:23 am

I have MS VisualStudio version 15.9.7 installed on my Windows computer. Intel C++/Fortran version 19, update 2 doesn't seem to recognize it. Is this a known issue? If so, what do I do?

↧

C++ compiler 19.0.2.187 crashes with provided reproducer

February 27, 2019, 2:01 pm

Using icpc (ICC) 19.0.2.187 20190117, when trying to compile the following code:

struct a {
~a();
};
struct b {
a c;
a d;
};
struct e {
b f[4];
};
class g {
e h{};
};
void i() { g(); }

the compiler crashes with the following error message:
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

icpc: error #10105: /opt/intel/compilers_and_libraries_2019.2.187/linux/bin/intel64/mcpcom: core dumped
icpc: warning #10102: unknown signal(2039514896)
icpc: error #10106: Fatal error in /opt/intel/compilers_and_libraries_2019.2.187/linux/bin/intel64/mcpcom, terminated by unknown

The provided example compiles fine with GCC, Clang and MSVC: https://godbolt.org/z/H5w7CK

↧

icpc program fails with double free when it shouldn't

March 3, 2019, 6:32 am

The following program fails with a "double free or corruption" error. I believe it is a compiler bug. Could someone please have a look at it?

icpc version 14.0.1 (gcc version 4.7.0 compatibility)

icpc -std=c++11 -O2 -g streamtest.cpp

#include <iostream>
#include <stdio.h>
#include <sstream>

struct Path
{
   Path()
   {
      _host = "host";
      _dir =  "dir";
      _file = "file";
   }

   std::string host()
   {
      return _host;
   }

   std::string path() {

      std::string ret = host() + _dir + _file;
      return  ret;
   }

   private:
      std::string _dir;
      std::string _file;
      std::string _host;
};

int main () {

  Path path;
  std::string p = path.path();

  return 0;

}

↧

32 bit intel c++ compiler

March 2, 2019, 4:32 am

How do I download an older version of the C++ compiler for duo core processors?

I have a 32-bit operating system (Windows 10, 32-bit).

↧

use OpenMP in Xcode

March 5, 2019, 12:55 am

I want to use the OpenMP library in Xcode with the Intel compiler, but when I use "#include <omp.h>" it shows me "'omp.h' file not found".

↧

icl.exe segfaults when compiling certain C99 code

March 8, 2019, 6:16 am

It appears that the following C99 program will reliably segfault the Intel C/C++ compiler on Windows:

struct s {
  void *wibble;
  void *wobble;
};
static struct s my_struct;

int main(void)
{
  static int x;
  my_struct = (struct s){ &x };
}

I've reproduced this on multiple versions of the Intel compiler, most recently version 19 update 3. Running icl.exe produces:

Intel(R) C++ Intel(R) 64 Compiler for applications running on IA-32, Version 19.0.3.203 Build 20190206 (13:50)

To reproduce the bug, compile the above program with:

icl /Qstd=c99 main.c

The output is:

icl /Qstd=c99 main.c
Intel(R) C++ Intel(R) 64 Compiler for applications running on IA-32, Version 19.0.3.203 Build 20190206
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

main.c
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

compilation aborted for main.c (code 4)

This bug also seems to occur if the target is 64-bit, and I've also reproduced the bug in version 18 update 3. From testing the various Intel compiler versions available on https://godbolt.org/, it seems that the code compiles on version 13.0.1, but fails on 16.0.3, 17.0.0, 18.0.0, 19.0.0, and 19.0.1.

From briefly experimenting with this program, it seems that the crash only occurs if x is declared static here, and it is also important that there is a member following 'wibble', although it need not be a pointer. It does not seem to matter what the lvalue is for the struct assignment (it can be a dereferenced pointer, for example).
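
Since the crash appears tied to the compound literal in the assignment, one variant that may be worth trying is to build the value in a named temporary first. The sketch below has the same effect on `my_struct`, and members without an initializer (here `wobble`) are still zero-initialized. The function name `assign_via_named_temporary` is made up, the snippet is valid as both C and C++, and it has not been tested against icl, so whether it avoids the crash is unknown.

```cpp
struct s {
  void *wibble;
  void *wobble;
};
static struct s my_struct;

/* Same effect as `my_struct = (struct s){ &x };`, written with a named
   temporary instead of a compound literal. */
void assign_via_named_temporary(void)
{
  static int x;
  struct s tmp = { &x };  /* wobble is implicitly set to a null pointer */
  my_struct = tmp;
}
```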

↧