Problems with NVIDIA Benchmarks #98

@yl-jiang

Environment:

  1. GPU cards: Tesla K80
  2. CUDA: 8.0
  3. cuDNN: 5.1
  4. OpenMPI: 1.10.2

Problems:

After running make, there are five binaries in .../nvidia/bin:

conv_bench gemm_bench nccl_mpi_all_reduce nccl_single_all_reduce rnn_bench

I can run 'rnn_bench' and 'nccl_single_all_reduce' successfully, but the other three fail:

  1. Running 'gemm_bench' fails with "terminate called after throwing an instance of 'std::runtime_error'".
  2. Running 'conv_bench' stops during the 11th test with "terminate called after throwing an instance of 'std::runtime_error' what(): Illegal algorithm passed to get_fwd_algo_string. Algo: 7".
  3. Running 'nccl_mpi_all_reduce' fails with "terminate called after throwing an instance of 'std::runtime_error' what(): NCCL failure: invalid device pointer in nccl_mpi_all_reduce.cu at line: 86 rank: 0".
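
For context on error 2, here is a minimal sketch of the kind of lookup that could produce that message. It is an assumption, not the benchmark's actual code. The function name `fwd_algo_name` is hypothetical. The integer-to-name mapping follows the `cudnnConvolutionFwdAlgo_t` values as I understand them from the cuDNN 5.1 header, where 7 would be the non-fused Winograd algorithm. If the real `get_fwd_algo_string` only handles values up to 6, a call with 7 would throw as shown above.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the benchmark's get_fwd_algo_string.
// The values below are assumed to match cudnnConvolutionFwdAlgo_t in cuDNN 5.1;
// verify against the cudnn.h installed on your system.
std::string fwd_algo_name(int algo) {
    switch (algo) {
        case 0: return "IMPLICIT_GEMM";
        case 1: return "IMPLICIT_PRECOMP_GEMM";
        case 2: return "GEMM";
        case 3: return "DIRECT";
        case 4: return "FFT";
        case 5: return "FFT_TILING";
        case 6: return "WINOGRAD";
        case 7: return "WINOGRAD_NONFUSED";  // the value reported in error 2
        default:
            // Unknown values still raise, mirroring the reported behaviour.
            throw std::runtime_error(
                "Illegal algorithm passed to fwd_algo_name. Algo: " +
                std::to_string(algo));
    }
}
```

With a mapping like this, `fwd_algo_name(7)` returns a name instead of throwing. Whether this matches the actual cause in 'conv_bench' is unconfirmed.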

How can I fix these errors?
