CUDA on a MacBook Pro

  1. Make sure you have CUDA-capable hardware.
  2. Make sure you have Mac OS X 10.5.6 or later (I have 10.6.5).
  3. Install Xcode (I installed xcode_3.2.5_and_ios_sdk_4.2_final.dmg).
  4. Install the CUDA driver (I downloaded devdriver_3.2.17_macos.dmg).
  5. Install the CUDA Toolkit (cudatoolkit_3.2.17_macos.pkg).
  6. Install the GPU Computing SDK (gpucomputingsdk_3.2.17_macos.pkg).
  7. Add the following two export lines to ~/.bash_profile:
    Paris-Mac:~ paris$ vi .bash_profile
    
    export PATH=/usr/local/cuda/bin:$PATH
    export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
  8. If you had a terminal open, close it and open a new one so that the changes to ~/.bash_profile take effect. Then run kextstat | grep -i cuda to verify that the driver is loaded in the kernel.
    Paris-Mac:~ paris$ kextstat | grep -i cuda
      131    0 0x1a1a000  0x2000     0x1000     com.nvidia.CUDA (1.1.0) <4 1>
    
  9. Then confirm that the CUDA compiler is installed:
    Paris-Mac:~ paris$ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2010 NVIDIA Corporation
    Built on Thu_Nov_11_15:26:50_PST_2010
    Cuda compilation tools, release 3.2, V0.2.1221
  10. Build the GPU Computing SDK samples:
    Paris-Mac:~ paris$ cd /Developer/GPU\ Computing/C
    Paris-Mac:C paris$ make
    [compilation takes a while; the binaries end up in /Developer/GPU Computing/C/bin/darwin/release]
    
  11. Check that everything worked:
    Paris-Mac:~ paris$ cd /Developer/GPU\ Computing/C/bin/darwin/release
    Paris-Mac:release paris$ ./deviceQuery
    ./deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    There is 1 device supporting CUDA
    
    Device 0: "GeForce GT 330M"
      CUDA Driver Version:                           3.20
      CUDA Runtime Version:                          3.20
      CUDA Capability Major/Minor version number:    1.2
      Total amount of global memory:                 536543232 bytes
      Multiprocessors x Cores/MP = Cores:            6 (MP) x 8 (Cores/MP) = 48 (Cores)
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       16384 bytes
      Total number of registers available per block: 16384
      Warp size:                                     32
      Maximum number of threads per block:           512
      Maximum sizes of each dimension of a block:    512 x 512 x 64
      Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             256 bytes
      Clock rate:                                    1.10 GHz
      Concurrent copy and execution:                 Yes
      Run time limit on kernels:                     Yes
      Integrated:                                    No
      Support host page-locked memory mapping:       Yes
      Compute mode:                                  Default (multiple host threads can use this device simultaneously)
      Concurrent kernel execution:                   No
      Device has ECC support enabled:                No
      Device is using TCC driver mode:               No
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = GeForce GT 330M
    
    
    PASSED
    
    Press <ENTER> to Quit...
    -----------------------------------------------------------
    
    

    If the test passes, you are ready to compile CUDA programs.
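
The checks in steps 8, 9 and 11 can be repeated in one pass, for example after a system update. What follows is only a minimal sketch, not an NVIDIA tool: the function names (check_driver, check_compiler, check_device, run_checks) and the SDK_BIN variable are hypothetical helpers made up for this example, and the path assumes the default SDK location used in step 10.

```shell
#!/bin/sh
# Sketch: repeat the checks from steps 8, 9 and 11.
# All function names and SDK_BIN are hypothetical helpers for this example.

# Where step 10 left the compiled samples (override by exporting SDK_BIN).
SDK_BIN="${SDK_BIN:-/Developer/GPU Computing/C/bin/darwin/release}"

# Step 8: is the CUDA kernel extension loaded?
check_driver() {
    command -v kextstat >/dev/null 2>&1 || { echo "driver: kextstat not found"; return 1; }
    if kextstat | grep -qi cuda; then
        echo "driver: loaded"
    else
        echo "driver: NOT loaded"
        return 1
    fi
}

# Step 9: is nvcc reachable through PATH? Prints its release line if so.
check_compiler() {
    if command -v nvcc >/dev/null 2>&1; then
        nvcc -V | grep -i release
    else
        echo "compiler: nvcc not found on PATH"
        return 1
    fi
}

# Step 11: does deviceQuery report PASSED?
check_device() {
    if [ ! -x "$SDK_BIN/deviceQuery" ]; then
        echo "deviceQuery: not built in $SDK_BIN"
        return 1
    fi
    # The transcript ends with a "Press ... to Quit" prompt, so stdin is
    # redirected from /dev/null to keep the program from blocking.
    if (cd "$SDK_BIN" && ./deviceQuery </dev/null) | grep -q PASSED; then
        echo "deviceQuery: PASSED"
    else
        echo "deviceQuery: did not pass"
        return 1
    fi
}

# Run all three checks and report how many failed.
run_checks() {
    failures=0
    check_driver   || failures=$((failures + 1))
    check_compiler || failures=$((failures + 1))
    check_device   || failures=$((failures + 1))
    echo "failed checks: $failures"
    [ "$failures" -eq 0 ]
}
```

To use it, save the block to a file (say cuda_checks.sh, a name chosen here for illustration), load it with `. ./cuda_checks.sh`, and call `run_checks`. The checks are kept as separate functions so that a single failing step can be re-run on its own.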

Source: CUDA Getting Started Mac