NVIDIA® CUDA™ Architecture

Introduction & Overview

Version 1.1 April 2009


Introduction
NVIDIA® CUDA™ technology leverages the massively parallel processing power of NVIDIA GPUs. The CUDA architecture is a parallel computing architecture that brings the performance of NVIDIA's graphics processor technology to general-purpose GPU computing. Applications that run on the CUDA architecture can take advantage of an installed base of over one hundred million CUDA-enabled GPUs in desktop and notebook computers, professional workstations, and supercomputer clusters. With the CUDA architecture and tools, developers are achieving dramatic speedups in fields such as medical imaging and natural resource exploration, and creating breakthrough applications in areas such as image recognition and real-time HD video playback and encoding. CUDA delivers this performance through standard APIs such as the soon-to-be-released OpenCL™ and DirectX® Compute, and through high-level programming languages such as C/C++, Fortran, Java, Python, and the Microsoft .NET Framework.

The CUDA Architecture
The CUDA Architecture consists of several components, numbered in the diagram below:
1. Parallel compute engines inside NVIDIA GPUs
2. OS kernel-level support for hardware initialization, configuration, etc.
3. User-mode driver, which provides a device-level API for developers
4. PTX instruction set architecture (ISA) for parallel computing kernels and functions
[Figure: the CUDA architecture. Under "Device-level APIs": applications using DirectX (HLSL compute shaders, DirectX Compute), applications using OpenCL (OpenCL C compute kernels, OpenCL driver), and applications using the CUDA Driver API (C for CUDA compute kernels). Under "Language Integration": applications using C, C++, Fortran, Java, Python, ... (C for CUDA compute functions, C Runtime for CUDA). Numbered components: (3) CUDA Driver, (4) PTX (ISA), (2) CUDA support in OS kernel, (1) CUDA parallel compute engines inside NVIDIA GPUs.]


The CUDA Software Development Environment
The CUDA Software Development Environment provides all the tools, examples and documentation necessary to develop applications that take advantage of the CUDA architecture.

Libraries: Advanced libraries that include BLAS, FFT, and other functions optimized for the CUDA architecture
C Runtime: The C Runtime for CUDA provides support for executing standard C functions on the GPU and allows native bindings for other high-level languages such as Fortran, Java, and Python
Tools: NVIDIA C Compiler (nvcc), CUDA Debugger (cuda-gdb), CUDA Visual Profiler (cudaprof), and other helpful tools
Documentation: Includes the CUDA Programming Guide, API specifications, and other helpful documentation
Samples: SDK code samples and documentation that demonstrate best practices for a wide variety of GPU Computing algorithms and applications
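As an illustration of the Libraries entry above, the following is a minimal sketch of calling a BLAS routine that runs on the GPU. It assumes the current cuBLAS interface (cublas_v2.h), which is newer than this document; the helper name saxpy_on_gpu is hypothetical, both vectors are assumed to have the same length, and error checking is omitted for brevity.

```cuda
// Sketch: y = alpha * x + y computed on the GPU by the cuBLAS library.
// Build (assumed command line): nvcc saxpy_cublas.cu -lcublas -o saxpy_cublas
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: copies x and y to the GPU, runs SAXPY there, and
// returns the updated y. Assumes x.size() == y.size(); no error checking.
std::vector<float> saxpy_on_gpu(float alpha,
                                const std::vector<float>& x,
                                std::vector<float> y) {
    const int n = static_cast<int>(x.size());
    const size_t bytes = n * sizeof(float);

    float* d_x = nullptr;
    float* d_y = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    cublasSaxpy(handle, n, &alpha, d_x, 1, d_y, 1);  // stride 1 for both vectors
    cublasDestroy(handle);

    cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    return y;
}

int main() {
    std::vector<float> x(4, 1.0f), y(4, 2.0f);
    std::vector<float> result = saxpy_on_gpu(3.0f, x, y);
    for (float v : result) std::printf("%.1f ", v);
    std::printf("\n");
    return 0;
}
```

The application never writes a kernel here: the library supplies the GPU code, and the host program only moves data and makes the call.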

The CUDA Software Development Environment supports two different programming interfaces:
1. A device-level programming interface, in which the application uses DirectX Compute, OpenCL or the CUDA Driver API directly to configure the GPU, launch compute kernels, and read back results.
2. A language integration programming interface, in which an application uses the C Runtime for CUDA and developers use a small set of extensions to indicate which compute functions should be performed on the GPU instead of the CPU.

When using the device-level programming interface, developers write compute kernels in separate files using the kernel language supported by their API of choice. DirectX Compute kernels (aka “compute shaders”) are written in HLSL. OpenCL kernels are written in a C-like language called “OpenCL C”. The CUDA Driver API accepts kernels written in C or PTX assembly.

When using the language integration programming interface, developers write compute functions in C and the C Runtime for CUDA automatically handles setting up the GPU and executing the compute functions. This programming interface enables developers to take advantage of native support for high-level languages such as C, C++, Fortran, Java, Python, and more (see below), reducing code complexity and development costs through type integration and code integration:
• Type integration allows standard types as well as vector types and user-defined types (including structs) to be used seamlessly across functions that are executed on the CPU and functions that are executed on the GPU.
• Code integration allows the same function to be called from functions that will be executed on the CPU and functions that will be executed on the GPU.
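The language integration interface, with its type and code integration, might look roughly like the sketch below. The Particle struct, the advance and advance_all functions, and the count_mismatches helper are hypothetical names chosen for illustration; error checking is omitted, and the code assumes a toolkit recent enough to accept C++ in device code.

```cuda
// Sketch of language integration: type integration and code integration.
// Build (assumed command line): nvcc particles.cu -o particles
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Type integration: one user-defined struct, built from a CUDA vector type,
// used unchanged by CPU code and GPU code.
struct Particle {
    float2 position;
    float2 velocity;
};

// Code integration: the same function is compiled for the CPU and the GPU.
__host__ __device__ inline void advance(Particle& p, float dt) {
    p.position.x += p.velocity.x * dt;
    p.position.y += p.velocity.y * dt;
}

// Compute function executed on the GPU, one thread per particle.
__global__ void advance_all(Particle* particles, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) advance(particles[i], dt);
}

// Hypothetical helper: advances n particles on the GPU and counts how many
// differ from the same step computed on the CPU.
int count_mismatches(int n, float dt) {
    Particle start;
    start.position = make_float2(0.0f, 0.0f);
    start.velocity = make_float2(1.0f, 2.0f);
    std::vector<Particle> host(n, start);

    Particle* device = nullptr;
    const size_t bytes = n * sizeof(Particle);
    cudaMalloc(&device, bytes);
    cudaMemcpy(device, host.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    advance_all<<<(n + threads - 1) / threads, threads>>>(device, n, dt);
    cudaMemcpy(host.data(), device, bytes, cudaMemcpyDeviceToHost);
    cudaFree(device);

    Particle expected = start;
    advance(expected, dt);  // same function, now running on the CPU

    // Exact comparison is acceptable here because the chosen values are
    // exactly representable in single precision.
    int mismatches = 0;
    for (const Particle& p : host) {
        if (p.position.x != expected.position.x ||
            p.position.y != expected.position.y) ++mismatches;
    }
    return mismatches;
}

int main() {
    std::printf("mismatches: %d\n", count_mismatches(1024, 0.5f));
    return 0;
}
```

The runtime handles device setup implicitly on the first call; the application only allocates memory, launches the compute function, and copies results back. A device-level version of the same program would instead load the kernel and configure the launch explicitly through the chosen driver API.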




When necessary to distinguish functions that will be executed on the CPU from those that will be executed on the GPU, the term C for CUDA is used to describe the small set of extensions that allow developers to specify which functions will be executed on the GPU, how GPU memory will be used, and how the parallel processing capabilities of the GPU will be used by the application.
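The three roles of the C for CUDA extensions named above can be seen together in a small sketch: __global__ marks a function for GPU execution, __shared__ places data in on-chip memory shared by a thread block, and the built-in indices plus the <<<...>>> launch syntax describe how the work is split across threads. The names block_sum and sum_on_gpu and the block size of 256 are assumptions made for this example; the input is assumed to be non-empty and error checking is omitted.

```cuda
// Sketch: summing an array with a per-block reduction in shared memory.
// Build (assumed command line): nvcc block_sum.cu -o block_sum
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block; must be a power of two

// __global__: this function runs on the GPU and is launched from the CPU.
__global__ void block_sum(const float* in, float* block_totals, int n) {
    // __shared__: one copy of this array per thread block, in on-chip memory.
    __shared__ float cache[kThreads];

    // Built-in indices identify which element this thread handles.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) block_totals[blockIdx.x] = cache[0];
}

// Hypothetical helper: returns the sum of data, computed mostly on the GPU.
// Assumes data is non-empty; no error checking.
float sum_on_gpu(const std::vector<float>& data) {
    const int n = static_cast<int>(data.size());
    const int blocks = (n + kThreads - 1) / kThreads;

    float* d_in = nullptr;
    float* d_totals = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_totals, blocks * sizeof(float));
    cudaMemcpy(d_in, data.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // <<<blocks, threads>>>: how many blocks and threads execute the kernel.
    block_sum<<<blocks, kThreads>>>(d_in, d_totals, n);

    std::vector<float> totals(blocks);
    cudaMemcpy(totals.data(), d_totals, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_totals);

    float total = 0.0f;  // finish the last few additions on the CPU
    for (float t : totals) total += t;
    return total;
}

int main() {
    std::vector<float> ones(1000, 1.0f);
    std::printf("sum = %.1f\n", sum_on_gpu(ones));
    return 0;
}
```

Adding the per-block totals on the CPU keeps the sketch short; a larger application might run a second reduction pass on the GPU instead.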

CUDA Adoption
CUDA was first introduced in March 2007, and over 100 million CUDA-enabled GPUs have been sold to date. Thousands of software developers are already using the free CUDA software development tools to solve problems in a variety of professional and home applications, from video and image processing and physics simulations to oil and gas exploration, product design, medical imaging, and scientific research.

Applications written in C and C++ can use the C Runtime for CUDA directly. Applications written in other languages can access the runtime via native method bindings, and there are several projects that enable developers to use the CUDA architecture this way, including:

Fortran:
  o Fortran wrapper for CUDA – http://www.nvidia.com/object/cuda_programming_tools.html
  o FLAGON Fortran 95 library for GPU Numerics – http://flagon.wiki.sourceforge.net/
  o PGI Fortran to CUDA compiler – http://www.pgroup.com/resources/accel.htm
Java:
  o JaCuda – http://jacuda.wiki.sourceforge.net
  o Bindings for CUDA BLAS and FFT libs – http://javagl.de/index.html
Python:
  o PyCUDA Python wrapper – http://mathema.tician.de/software/pycuda
.NET languages:
  o CUDA.NET – http://www.gass-ltd.co.il/en/products/cuda.net
Resources for other languages:
  o SWIG – http://www.swig.org (generates interfaces to C/C++ for dozens of languages)

Developers can take advantage of the performance of the CUDA architecture today using rich APIs and a variety of high-level languages on 32-bit and 64-bit versions of Linux, MacOS, and Windows.

For more information about GPU Computing and the CUDA architecture, please visit www.nvidia.com/cuda.


Notice
ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks
NVIDIA, the NVIDIA logo, CUDA, and GeForce are trademarks or registered trademarks of NVIDIA Corporation. OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc. DirectX is a registered trademark of Microsoft Corporation. Other company and product names may be trademarks of the respective companies with which they are associated.

Copyright © 2009 by NVIDIA Corporation. All rights reserved.

NVIDIA Corporation 2701 San Tomas Expressway Santa Clara, CA 95050 www.nvidia.com
