oneAPI and SIMD Instructions are a natural fit for database acceleration on Intel FPGAs
Single Instruction Multiple Data (SIMD) is a widely used technique for raising the performance of single-threaded tasks on modern CPUs: one instruction operates on several data elements at once. FPGAs deliver high performance by tailoring circuits to a specific algorithm; this customized hardware can significantly accelerate complex computations.
IEEE FPGA Workshops and Tutorials featuring Intel speakers, May 9-13
The 29th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM) takes place online from May 9-12. Intel speakers will present several workshops and tutorials at the conference, including:
Intel FPGA Cloud Services and Remote Learning (workshop, May 9)
AI Optimized Intel® Stratix® 10 NX FPGA (tutorial, May 12)
Using Intel® oneAPI Toolkits with FPGAs (workshop, May 12)
For more information about these IEEE FCCM workshops and tutorials, including registration details, click here.
Notices and Disclaimers
Intel technologies may require enabled hardware, software or service activation. No product or component can be absolutely secure. Your costs and results may vary. © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
Parallel computing expert James Reinders says that XPUs are the Future of Compute
James Reinders, a 27-year Intel alum who recently rejoined Intel after a four-year stint as a parallel computing consultant and expert, has written and published a comprehensive book about Data Parallel C++ (DPC++) titled “Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL” (see “Springer and Intel publish new book on DPC++ parallel programming, and you can get a free PDF copy!”). DPC++ lets software developers write code in a “single-source” style that can then generate parallel run-time code for heterogeneous processors including CPUs, GPGPUs, FPGAs, and other hardware accelerators. As a class, Intel calls these processors “XPUs.”
Now, Reinders has published an article titled “Heterogeneous Processing Requires Data Parallelization: SYCL and DPC++ are a Good Start” that provides a quick introduction to the XPU concept and looks at the future of heterogeneous parallel programming. In this article, Reinders writes: “SYCL and DPC++ will help us make effective use of XPUs. They are part of a broader push for support of XPUs that extends into libraries and all software development tools, building on the ambitions of SYCL and its compilers.” He continues: “That is the origin of the oneAPI industry initiative, which I’m really passionate about and was excited to be a part of as I rejoined Intel.” Later, in the article’s conclusion, Reinders writes: “I hope you’ll take the opportunity to get educated about SYCL, DPC++ and oneAPI because XPUs are the future of compute.”
If you want to understand what has gotten Reinders so excited about XPUs, DPC++, and the oneAPI initiative, then give his article a read.
Please join Intel for the oneAPI Developer Summit on April 26, 2021. It’s free!
Join Intel for the oneAPI DevSummit, part of the International Workshop on OpenCL (IWOCL) and focused on using Intel® oneAPI and Data Parallel C++ (DPC++) for accelerated computing across xPU architectures including CPUs, GPUs, FPGAs, and other accelerators. During this one-day LIVE virtual conference, you will learn from leading industry and academia speakers who are working on innovative solutions involving cross-platform, multi-vendor architectures. If you’re working with FPGAs, two presentations during the Developer Summit are of special note:
Comparative Analysis of Intel HLS Design Tools on a Case Study in Neuromorphic: Neuromorphic object-classification algorithms have lower memory and compute complexities than CNNs at similar accuracies, which can improve the scalability of ML apps. This presentation explores the efficacy of HLS design tools, including Intel® SDK For OpenCL™ Applications for FPGAs and Intel oneAPI DPC++ for Intel® FPGAs, in terms of design latency and hardware resources. The talk includes case study details featuring a novel, neuromorphic ML algorithm.
It’s Acceleration but Faster! A Business Perspective on FPGA Development: This talk explores the balance between time-to-market and performance optimization of application development using FPGAs. Informed by Intel partner for Creative Solutions Space Ltd’s journey from RTL to OpenCL to Intel’s oneAPI platform, the discussion focuses on real-world examples and the advantages of using agile approaches to FPGA development.
Interested? Want to attend this free Developer Summit? Then register here. Need more info? Click here. For more information about Intel DPC++, see “Springer and Intel publish new book on DPC++ parallel programming, and you can get a free PDF copy!”
Free Webinar: Using Intel® oneAPI™ to Achieve High-Performance Compute Acceleration with FPGAs
Interested in learning how to use the Intel® oneAPI™ open, unified programming model to accelerate data-centric workloads using FPGAs? Then sign up for the free Webinar titled “Using Intel oneAPI to Achieve High-Performance Compute Acceleration with FPGAs.” It’s being presented on March 23 and March 25 by Intel and BittWare. The Webinar will zero in on a real-world 2D FFT workload accelerated by BittWare's 520N-MX PCIe acceleration card, which is based on the Intel® Stratix® 10 MX FPGA. This Intel FPGA incorporates high-performance HBM2 memory, which delivers additional acceleration speed. The Webinar will cover:
How the Intel oneAPI unified programming model enables easier, software-like FPGA workload acceleration development
A look at BittWare's accelerated 2D FFT code
A discussion of various development tools including the Intel® VTune™ Profiler, which optimizes application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more
A preview of next-generation acceleration cards like BittWare's IA-840F, which is based on the Intel® Agilex™ FPGA
A live Q&A with the Webinar’s four panelists from Intel and BittWare
Intrigued? Register for either Webinar by clicking the links below:
Tuesday March 23: 8:30am UK/4:30am Eastern (time targets EU/Asia)
Thursday March 25: 6:30pm UK/2:30pm Eastern (time targets Americas)
Note: If you can't make the live event, register anyway and you will get access to the on-demand Webinar recording.
Notices & Disclaimers
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
Free Webinar: Accelerate FPGA Programming using Data Parallel C++ (DPC++) and the Intel® oneAPI Base Toolkit with the Intel® FPGA Add-on
Are you curious about using Intel® oneAPI and Data Parallel C++ (DPC++) to develop FPGA accelerators? Sign up for this free Webinar that will show you how to use two free tools, the Intel® oneAPI Base Toolkit (Base Kit) and the Intel® FPGA Add-on for the oneAPI Base Toolkit, to speed programming of Intel® FPGAs. Webinar topics include:
How the Intel oneAPI Base Kit enables functional verification through quick emulation and rapid performance tuning
How to develop a Hough Transform algorithm, a feature extraction method used in computer vision applications, using DPC++
The steps necessary to generate FPGA binaries
Using the Intel FPGA Add-on tool to run a pre-compiled bitstream to observe the algorithm's performance on real FPGA accelerator hardware
Register for the Webinar here. For more information about Intel oneAPI and DPC++, see “Springer and Intel publish new book on DPC++ parallel programming, and you can get a free PDF copy!”
NextPlatform.com article describes Intel® oneAPI use at CERN for Large Hadron Collider (LHC) research
Independent consultant James Reinders has just published a comprehensive article on the NextPlatform.com Web site titled “CERN uses [Intel®] DL Boost, oneAPI to juice inference without accuracy loss,” which describes the use of deep learning and Intel® oneAPI by CERN to accelerate Monte Carlo simulations for Large Hadron Collider (LHC) research. Reinders writes that CERN researchers “have demonstrated success in accelerating inferencing nearly two-fold by using reduced precision without compromising accuracy at all.” The work is being carried out as part of Intel’s long-standing collaboration with CERN through CERN openlab. If Reinders’ name looks familiar to you, that’s because he recently published a book about the use of Data Parallel C++ (DPC++), which is the foundation compiler technology at the heart of Intel oneAPI. (See “Springer and Intel publish new book on DPC++ parallel programming, and you can get a free PDF copy!”)
CERN researchers found that about half of the computations in a type of neural network (NN) called a Generative Adversarial Network (GAN) could be switched from FP32 to INT8 numerical precision, which is directly supported by Intel® DL Boost, without loss of accuracy. GAN performance nearly doubled as a result while accuracy was not affected. Although this work was done using Intel® Xeon® Scalable Processors with direct INT8 support, Reinders’ article also makes the next logical jump: “INT8 has broad support thanks to Intel Xeon [Scalable Processors], and it is also supported in Intel® Xe GPUs. FPGAs can certainly support INT8 and other reduced precision formats.”
Further, writes Reinders: “The secret sauce underlying this work and making it even better: oneAPI makes Intel DL Boost and other acceleration easily available without locking in applications to a single vendor or device.”
“It is worth mentioning how oneAPI adds value to this type of work. Key parts of the tools used, including the acceleration tucked inside TensorFlow and Python, utilize libraries with oneAPI support. That means they are openly ready for heterogeneous systems instead of being specific to only one vendor or one product (e.g. GPU).
“oneAPI is a cross-industry, open, standards-based unified programming model that delivers a common developer experience across accelerator architectures. Intel helped create oneAPI, and supports it with a range of open source compilers, libraries, and other tools. By programming to use INT8 via oneAPI, the kind of work done at CERN described in this article could be carried out using Intel Xe GPUs, FPGAs, or any other device supporting INT8 or other numerical formats for which they may quantize.”
For additional information about Intel oneAPI, see “Release beta09 of Intel® oneAPI Products Now Live – with new programming tools for FPGA acceleration including Intel® VTune™ Profiler.” You may also be interested in an instructor-led class titled “Using Intel® oneAPI Toolkits with FPGAs (IONEAPI).”
Notices & Disclaimers
Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex. Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details.
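The FP32-to-INT8 switch described in this post can be illustrated with a small, self-contained sketch. It is not CERN's code and does not use Intel DL Boost or any oneAPI library; it assumes simple symmetric, per-tensor quantization, and the names `choose_scale`, `quantize`, and `dequantize` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: pick a scale so the largest |value| maps to 127.
// Assumes symmetric, per-tensor quantization (an illustrative choice).
float choose_scale(const std::vector<float>& values) {
    float max_abs = 0.0f;
    for (float v : values) max_abs = std::max(max_abs, std::fabs(v));
    // Fall back to 1.0 for an all-zero tensor to avoid dividing by zero later.
    return max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
}

// Map FP32 values to INT8: divide by the scale, round to nearest, clamp.
std::vector<std::int8_t> quantize(const std::vector<float>& values, float scale) {
    assert(scale > 0.0f);
    std::vector<std::int8_t> out;
    out.reserve(values.size());
    for (float v : values) {
        long q = std::lround(v / scale);
        q = std::max(-127L, std::min(127L, q));  // keep within the INT8 range used here
        out.push_back(static_cast<std::int8_t>(q));
    }
    return out;
}

// Recover approximate FP32 values from the INT8 codes.
std::vector<float> dequantize(const std::vector<std::int8_t>& quantized, float scale) {
    std::vector<float> out;
    out.reserve(quantized.size());
    for (std::int8_t q : quantized) out.push_back(static_cast<float>(q) * scale);
    return out;
}
```

For a tensor such as {-1.0, -0.5, 0.0, 0.25, 1.0}, `choose_scale` returns 1/127, and each value recovered by `dequantize` differs from the original by at most about half a scale step. Whether that rounding error is tolerable for a given computation is the empirical question the CERN researchers answered for about half of their GAN's computations.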
Biz Tech thought leader Dez Blanchfield’s recent podcast discusses infrastructure and application acceleration using FPGAs, SmartNICs, and Intel® eASIC™ structured ASICs with Intel’s Jim Dworkin
Business and digital transformation strategy thought leader Dez Blanchfield regularly posts podcasts in a series titled “Conversations with Dez” about topics of interest to the technology industry. He recently interviewed Jim Dworkin, Senior Director of the Cloud Business Unit in the Programmable Solutions Group at Intel, in a wide-ranging discussion that touches on data center architecture and on making FPGAs easier to use to accelerate various workloads at scale within the data center. The two also discuss the roles of FPGA-based SmartNICs, which can “hard wire” data flows within the data center to build strategic pathways that reduce data movement, accelerate processing rates, speed time to market, and lower total cost of ownership. In addition, their discussion covers the broader use of Intel® FPGAs and Intel® eASIC™ structured ASICs, the new AI-optimized Intel® Stratix® 10 NX FPGA, and the use of Intel® Open FPGA Stack (Intel® OFS) and Intel® oneAPI toolkits to ease the adoption of all forms of Intel® XPU acceleration including FPGAs.
The hour-long interview is now available on SoundCloud. Click here to listen.
For more information about the AI-optimized Intel Stratix 10 NX FPGA, see “Intel has just announced its first AI-optimized FPGA – the Intel® Stratix® 10 NX FPGA – to address the rapid increase in AI model complexity” and “More details on the Intel® Stratix® 10 NX FPGA, the first AI-optimized Intel® FPGA, now available in a new White Paper.” For more information about Intel OFS, see “Intel® Open FPGA Stack Eases Development of Custom Acceleration Platform Solutions.” For more information about Intel oneAPI, see “Intel’s One API will allow you to write code once, then target many processing resources: CPUs, GPUs, FPGAs, AI engines.”
Gold Release of Intel® oneAPI Toolkits arrive in December: One Programming Model for a Heterogeneous World of CPUs, GPUs, and FPGAs
This week at the Supercomputing 2020 (SC20) conference, Intel announced that the gold release of the Intel® oneAPI toolkits will become available next month. The oneAPI industry initiative is creating a unified and simplified cross-architecture programming model that delivers uncompromised performance without proprietary lock-in, while allowing you to integrate legacy code. Intel oneAPI toolkits allow you to create code for CPUs and XPUs (the term “XPU” means “other processing units”), such as GPUs based on the Intel® Xe architecture and Intel® FPGAs including Intel® Arria® and Intel® Stratix® FPGAs, within a unified programming environment. With oneAPI, you choose the best processing architecture for the specific problem you’re solving without needing to rewrite software.
Intel oneAPI toolkits take full advantage of cutting-edge hardware capabilities and instructions built into Intel® CPUs including Intel® AVX-512 SIMD instruction extensions and Intel® DL Boost, along with features unique to Intel® XPUs. Built on long-standing and proven Intel developer tools, Intel oneAPI toolkits support familiar languages and software standards while providing full continuity with existing code.
The gold release of Intel oneAPI toolkits will start shipping in December. The toolkits will be available for free, to run locally and in the Intel® DevCloud. Commercial versions that include worldwide support from Intel technical consulting engineers will also be offered.
For more information about this and other Intel SC20 announcements, click here.
Springer and Intel publish new book on DPC++ parallel programming, and you can get a free PDF copy!
Data Parallel C++ (DPC++) is an open-source compiler project based on Khronos SYCL with a few extensions. It is also the foundation compiler technology for oneAPI, a cross-industry, open, standards-based unified programming model that delivers a common developer experience across accelerator architectures. SYCL is an industry-driven Khronos programming language standard that adds data parallelism to the C++ language with support for heterogeneous computing architectures. DPC++ also offers broad, heterogeneous support for CPUs, GPUs, and FPGAs, which is why it’s the compiler at the core of the oneAPI specification.
Springer has just published a new book titled “Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL” under its Apress open-access imprint. The book’s authors include James Reinders, a consultant with more than three decades of parallel computing experience, and five additional authors from Intel. A printed version of the book is now on sale online and a free PDF version is available on the SpringerLink Web site under the Creative Commons Attribution 4.0 International License.
This new 548-page book covers programming for data parallelism using C++ in depth. All examples in the book, starting with the “Hello Data-Parallel Programming World” example in Chapter 1, compile and work with DPC++ compilers and are available from a GitHub repository. The book is for all software developers, whether they’re new to parallel programming or old hands at it. As the authors write in the book’s preface: “If you are new to parallel programming, that is okay. If you have never heard of SYCL or the DPC++ compiler, that is also okay.”
For readers of this Programmable Logic blog, Chapter 17 titled “Programming for FPGAs” will be especially interesting. As the authors explain in Chapter 17’s second paragraph: “Field Programmable Gate Arrays (FPGAs) are unfamiliar to the majority of software developers, in part because most desktop computers don’t include an FPGA alongside the typical CPU and GPU. But FPGAs are worth knowing about because they offer advantages in many applications. The same questions need to be asked as we would of other accelerators, such as “When should I use an FPGA?”, “What parts of my applications should be offloaded to FPGA?”, and “How do I write code that performs well on an FPGA?”
“This chapter gives us the knowledge to start answering those questions…”
For more information about “Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL,” and to download a free PDF copy of the book, click here.