GPGPU/NVIDIA CUDA


NVIDIA CUDA Research Center

10/6/2010: NC State University has become an NVIDIA CUDA Research Center.

Link to NVIDIA CUDA Research Center Page (Press Release)

NVIDIA CUDA Teaching Center

12/2/2010: NC State University has become an NVIDIA CUDA Teaching Center.

Link to NVIDIA CUDA Teaching Center Page


Hardware (donated by NVIDIA)

1 GeForce 8800 GTX (retired)

  • Stream Processors 128
  • Core Clock (MHz) 575
  • Shader Clock (MHz) 1350
  • Memory Clock (MHz) 900
  • Memory Amount 768MB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 86.4
  • Texture Fill Rate (billion/sec) 36.8
  • full specs

2 GeForce 9800 GX2 (retired)

  • Stream Processors 256
  • Core Clock (MHz) 675
  • Shader Clock (MHz) 1688
  • Memory Clock (MHz) 2200
  • Memory Amount 1024MB
  • Memory Interface 512-bit
  • Memory Bandwidth (GB/sec) 128
  • Texture Fill Rate (billion/sec) 76.8
  • full specs

18 GeForce GTX 280 (on os40..os57)

  • Stream Processors 240
  • Core Clock (MHz) 602
  • Shader Clock (MHz) 1296
  • Memory Clock (MHz) 2214
  • Memory Amount 1024MB
  • Memory Interface 512-bit
  • Memory Bandwidth (GB/sec) 142
  • Texture Fill Rate (billion/sec) 48.2
  • full specs

1 Tesla C1060 (retired)

  • Stream Processors 240
  • Core Clock (MHz) 602
  • Shader Clock (MHz) 1296
  • Memory Clock (MHz) 1600
  • Memory Amount 4GB
  • Memory Interface 512-bit
  • Memory Bandwidth (GB/sec) 102
  • Texture Fill Rate (billion/sec) ?
  • full specs

9 Tesla C2050 (on arc cluster)

  • Stream Processors 448
  • Core Clock (MHz) 575
  • Shader Clock (MHz) 1150
  • Memory Clock (MHz) 3000
  • Memory Amount 3GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 144
  • Texture Fill Rate (billion/sec) ?
  • full specs

78 GTX 480 (on arc cluster)

  • Stream Processors 480
  • Core Clock (MHz) 700
  • Shader Clock (MHz) 1401
  • Memory Clock (MHz) 3696
  • Memory Amount 1.5GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 177
  • Texture Fill Rate (billion/sec) ?
  • full specs

1 GTX 680 (on arc cluster)

  • Stream Processors 1536
  • Core Clock (MHz) 1006
  • Shader Clock (MHz) 1058
  • Memory Clock (MHz) 6008
  • Memory Amount 2.0GB
  • Memory Interface 256-bit
  • Memory Bandwidth (GB/sec) 192
  • Texture Fill Rate (billion/sec) ?
  • full specs

5 GTX 780 (on arc cluster)

  • Stream Processors 2304
  • Core Clock (MHz) 863
  • Shader Clock (MHz) 900
  • Memory Clock (MHz) 6008
  • Memory Amount 3.0GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 288
  • Texture Fill Rate (billion/sec) ?
  • full specs

3 Tesla K20c (on arc cluster)

  • Stream Processors 2496
  • Core Clock (MHz) 705
  • Shader Clock (MHz) N/A
  • Memory Clock (MHz) 5200
  • Memory Amount 5.0GB
  • Memory Interface 320-bit
  • Memory Bandwidth (GB/sec) 200
  • Texture Fill Rate (billion/sec) ?
  • full specs

1 Tesla K40c (on arc cluster)

  • Stream Processors 2880
  • Core Clock (MHz) 875
  • Shader Clock (MHz) N/A
  • Memory Clock (MHz) 3004
  • Memory Amount 12GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 288
  • Texture Fill Rate (billion/sec) ?
  • full specs

3 GeForce GTX Titan X (on arc cluster)

  • Stream Processors 3072
  • Core Clock (MHz) 1000
  • Shader Clock (MHz) N/A
  • Memory Clock (MHz) 7010
  • Memory Amount 12GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 336
  • Texture Fill Rate (billion/sec) ?
  • full specs

2 GeForce GTX 1080 (on arc cluster)

  • Stream Processors 2560
  • Core Clock (MHz) 1607
  • Shader Clock (MHz) N/A
  • Memory Clock (MHz) 10,000
  • Memory Amount 8GB
  • Memory Interface 256-bit
  • Memory Bandwidth (GB/sec) 320
  • Texture Fill Rate (billion/sec) ?
  • full specs

1 GeForce GTX Titan X 10 series (on arc cluster)

  • Stream Processors 3584
  • Core Clock (MHz) 1417
  • Shader Clock (MHz) N/A
  • Memory Clock (MHz) 10,000
  • Memory Amount 12GB
  • Memory Interface 384-bit
  • Memory Bandwidth (GB/sec) 480
  • Texture Fill Rate (billion/sec) ?
  • full specs
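
The Memory Bandwidth figures above follow from the memory clock and the interface width: bandwidth in GB/sec is roughly the effective memory clock in MHz times the interface width in bits, divided by 8 and then by 1000. A minimal sketch of that arithmetic is below. The helper name mem_bandwidth_gbs is hypothetical, and it assumes the listed Memory Clock is the effective data rate, which holds for entries such as the GTX 280, Tesla C2050 and GTX 480 but not for entries that list a base clock.

```shell
#!/bin/sh
# Sketch only: mem_bandwidth_gbs is a hypothetical helper name.
# Assumes the memory clock argument is the effective data rate in MHz.

# Usage: mem_bandwidth_gbs <effective memory clock in MHz> <interface width in bits>
# Prints the peak bandwidth in GB/sec, rounded to one decimal place.
mem_bandwidth_gbs() {
    clock_mhz=$1
    bus_bits=$2
    mb_per_s=$(( clock_mhz * bus_bits / 8 ))   # bits -> bytes per transfer, MHz -> MB/sec
    tenths=$(( (mb_per_s + 50) / 100 ))        # GB/sec in tenths, rounded
    printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

mem_bandwidth_gbs 2214 512   # GeForce GTX 280 -> 141.7
mem_bandwidth_gbs 3000 384   # Tesla C2050     -> 144.0
mem_bandwidth_gbs 3696 384   # GTX 480         -> 177.4
```

The list rounds these results to whole numbers (142, 144, 177).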


Hardware (not donated by NVIDIA)


Samsung N510 under Fedora 12 with CUDA

Software

All software is 32 bit unless marked otherwise.


Access

  • Request a user ID for the workstations os40..os57 from Frank Mueller. Please include your Unity ID and student ID.
    Accounts on os40..os57 share an NFS file space.

User Installation

  • Append to your ~/.bashrc:
    export PATH=".:$HOME/bin:/usr/local/bin:/usr/bin:$PATH"
    export PATH="/usr/local/cuda/bin:$PATH"
    export LD_LIBRARY_PATH="/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export MANPATH="/usr/share/openmpi/1.2.4-gcc/man:$MANPATH"
    
    Log out and back in to activate the new settings.

  • for CUDA 5.0rc1:
    • Install the SDK in your directory:
      cd /usr/local
      tar czf ~/NVIDIA_GPU_Computing_SDK-50.tgz cuda-5.0/samples
      cd ~
      tar xzf ~/NVIDIA_GPU_Computing_SDK-50.tgz
      
    • Test the SDK:
      cd cuda-5.0/samples
      make
      ./bin/linux/release/bandwidthTest
      ./bin/linux/release/matrixMul
      
    • Tools for Developing/Debugging CUDA Programs
      • cuda-gdb (CUDA debugger)
      • nsight (Eclipse for CUDA)
  • for CUDA 4.1rc2:
    • Install the SDK in your directory:
      sh /home/root/gpucomputingsdk_4.1.21_linux.run
      
    • Test the SDK:
      cd NVIDIA_GPU_Computing_SDK/C
      # Edit common/common.mk and add the following line before "# This line invokes...":
      #   LIB += -lpthread
      make
      ./bin/linux/release/bandwidthTest
      ./bin/linux/release/matrixMul
      
  • for CUDA 3.2:
    • Install the SDK in your directory:
      sh /home/root/gpucomputingsdk_3.2.16_linux.run
      
    • Test the SDK:
      cd NVIDIA_GPU_Computing_SDK/C
      # Edit common/common.mk and add the following line before "# This line invokes...":
      #   LIB += -lpthread
      make
      ./bin/linux/release/bandwidthTest
      ./bin/linux/release/matrixMul
      
  • for CUDA 2.3:
    • Install and test the SDK:
      mkdir -p NVIDIA_GPU_Computing_SDK/C
      cd NVIDIA_GPU_Computing_SDK/C
      cp /usr/local/NVIDIA_GPU_Computing_SDK/C/Makefile .
      mkdir common src
      cd common
      cp /usr/local/NVIDIA_GPU_Computing_SDK/C/common/Makefile .
      cp /usr/local/NVIDIA_GPU_Computing_SDK/C/common/common.mk .
      mkdir obj
      ln -s /usr/local/NVIDIA_GPU_Computing_SDK/C/common/* .
      cd ../src
      cp -R /usr/local/NVIDIA_GPU_Computing_SDK/C/src/matrixMul .
      cp -R /usr/local/NVIDIA_GPU_Computing_SDK/C/src/bandwidthTest .
      cd ..
      make
      bin/linux/release/bandwidthTest
      bin/linux/release/matrixMul
      
  • for CUDA 2.2 or earlier:
    • Install and test the SDK:
      mkdir NVIDIA
      cd NVIDIA
      cp /usr/local/NVIDIA_CUDA_SDK/Makefile .
      mkdir common projects
      cd common
      cp /usr/local/NVIDIA_CUDA_SDK/common/Makefile .
      cp /usr/local/NVIDIA_CUDA_SDK/common/common.mk .
      ln -s /usr/local/NVIDIA_CUDA_SDK/common/* .
      cd ../projects
      cp -R /usr/local/NVIDIA_CUDA_SDK/projects/matrixMul .
      cp -R /usr/local/NVIDIA_CUDA_SDK/projects/bandwidthTest .
      cd ..
      make
      bin/linux/release/bandwidthTest
      bin/linux/release/matrixMul
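
Once ~/.bashrc has been updated and one of the SDK builds above has finished, a quick sanity check can be scripted along the following lines. This is only a sketch: path_contains and run_samples are hypothetical helper names, not part of the CUDA SDK, and the script assumes the sample programs signal failure through a non-zero exit status.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the CUDA SDK.

# Succeed if directory $2 is an entry of the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Run each named sample from directory $1 and report the result.
# Prints one "PASS name", "FAIL name" or "MISSING name" line per sample;
# returns non-zero if any sample is missing or fails.
# Assumption: a sample exits with a non-zero status when it fails.
run_samples() {
    dir=$1; shift
    status=0
    for name in "$@"; do
        if [ ! -x "$dir/$name" ]; then
            echo "MISSING $name"; status=1
        elif "$dir/$name" >/dev/null 2>&1; then
            echo "PASS $name"
        else
            echo "FAIL $name"; status=1
        fi
    done
    return $status
}

# Example use from the SDK build directory (paths taken from the steps above):
#   path_contains "$PATH" /usr/local/cuda/bin || echo "CUDA bin directory not on PATH"
#   run_samples ./bin/linux/release bandwidthTest matrixMul
```

A MISSING line usually means make did not complete. A FAIL line is worth rerunning by hand to see the sample's own output.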
      
