Microsoft and Amazon announce deep learning library Gluon

Microsoft has announced a new partnership with Amazon to create an open-source deep learning library called Gluon. The idea behind Gluon is to make artificial intelligence more accessible and valuable.

According to Microsoft, the library simplifies the process of building deep learning models and will enable developers to run multiple deep learning libraries. This announcement follows the introduction of the Open Neural Network Exchange (ONNX) format, another effort toward an open AI ecosystem.

Gluon supports both symbolic and imperative programming, something not supported by many other toolkits, Microsoft explained. It will also support hybridization of code, allowing compute graphs to be cached and reused in future iterations, and it offers a layers library of pre-built building blocks for defining model architectures. Gluon natively supports loops and ragged tensors, allowing for high execution efficiency for RNN and LSTM models, as well as sparse data and operations. It also provides the ability to do advanced scheduling across multiple GPUs.
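Hybridization can be pictured as tracing an imperative function once and replaying the recorded compute graph on later calls. The following toy Python sketch illustrates only the concept; the class and method names are invented for illustration and are not Gluon's actual API.

```python
# Toy illustration of hybrid imperative/symbolic execution (not Gluon's API).
# The first call runs imperatively and records each operation; subsequent
# calls replay the cached "graph" instead of re-dispatching every step.

class HybridBlock:
    def __init__(self):
        self._cached_graph = None  # list of (op, arg) pairs after tracing

    def forward(self, trace, x):
        raise NotImplementedError

    def __call__(self, x):
        if self._cached_graph is None:
            trace = []                      # imperative pass records ops
            result = self.forward(trace, x)
            self._cached_graph = trace
            return result
        # "symbolic" pass: replay the cached graph, skipping Python control flow
        for op, arg in self._cached_graph:
            x = op(x, arg)
        return x

class ScaleShift(HybridBlock):
    """Computes x * 2 + 1, recording both operations on the first call."""
    def forward(self, trace, x):
        mul = lambda v, a: v * a
        add = lambda v, a: v + a
        trace.append((mul, 2.0)); x = mul(x, 2.0)
        trace.append((add, 1.0)); x = add(x, 1.0)
        return x

net = ScaleShift()
print(net(3.0))  # first call traces: 3*2+1 = 7.0
print(net(5.0))  # replayed from the cached graph: 11.0
```

In real Gluon, the cached graph is also optimized and executed outside the Python interpreter, which is where the performance benefit comes from.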

“This is another step in fostering an open AI ecosystem to accelerate innovation and democratization of AI – making it more accessible and valuable to all,” Microsoft wrote in a blog post. “With Gluon, developers will be able to deliver new and exciting AI innovations faster by using a higher-level programming model and the tools and platforms they are most comfortable with.”

The library will be available for Apache MXNet or Microsoft Cognitive Toolkit. It is already available on GitHub for Apache MXNet, with Microsoft Cognitive Toolkit support on the way.

The Whopper Coin, Movidius Myriad X VPU, and DxEnterprise v17

Burger King has launched the Whopper Coin in Russia, which uses blockchain technology as a secure system for rewards points.

Customers will be able to scan their receipt with a smartphone and will be rewarded with one WhopperCoin for every rouble ($0.02) spent on a Whopper sandwich at the fast-food chain. When a user amasses 1,700 WhopperCoin (five or six burgers’ worth of purchases), they can redeem them for a free Whopper.

Since the crypto-currency is hosted on the Waves platform, it can be freely traded and transferred like any other.

“Eating Whoppers now is a strategy for financial prosperity tomorrow,” said Ivan Shestov, Burger King Russia’s head of external communications.

DH2i adds Linux, Docker support to high availability container solution
High availability and disaster recovery developer DH2i has launched DxEnterprise v17, adding support for Linux to the previously Windows Server-exclusive virtualization management software.

The new release adds support for Docker containers for the first time, as well as updated support for SQL Server 2017.

“DH2i’s expanded capabilities have made the underlying infrastructure and platform essentially irrelevant for our customers,” said OJ Ngo, co-founder and CTO of DH2i. “Our customers are able to enjoy an extremely simplistic management experience with our unified interface for Windows, Linux and Docker—all while our Smart Availability technology dynamically ensures that workloads only come online at their best execution venue.”

Introducing the Movidius Myriad X vision processing unit (VPU)
The Intel subsidiary Movidius has announced its Movidius Myriad X vision processing unit, which is intended for deep learning and AI acceleration in vision-based devices such as drones, cameras, and AR/VR headsets.

The Myriad X features a Neural Compute Engine, which lets the Myriad X achieve over one trillion operations per second of peak DNN inferencing throughput. It also comes with a Myriad Development Kit, which includes all development tools, frameworks and APIs to implement custom vision, imaging, and deep neural network workloads on the chip.

Using Preact instead of React
There are plenty of alternatives to React, and the team behind one open-source project, Preact, thinks it is the best choice.

With the thinnest possible Virtual DOM abstraction on top of the DOM, Preact is a “first class citizen of the web platform,” according to the Preact team.

Preact is a speedy, lightweight library option, and it’s designed to work with plenty of React components. Preact is also small enough that your code is the largest part of your application, according to Preact’s team, which means less JavaScript to download, parse and execute. It includes extra performance features, it’s optimized for event handling via Linked State, and developers can use Preact to build parts of apps without complex integration.

Kaldi speech recognition gains TensorFlow deep learning support

Kaldi, an open-source speech recognition toolkit, has been updated to integrate with the open-source TensorFlow deep learning library.

Developers Yishay Carmiel and Hainan Xu of Seattle-based IntelligentWire are behind the integration, and their plan is to use the combination to accelerate the advancement of automatic speech recognition (ASR) systems.

IntelligentWire specializes in cloud software that helps businesses gather analytics from live phone conversations between representatives and customers and automatically handles data entry and responding to requests. The company currently focuses on the contact center market, which amasses over 50 billion hours of phone calls and 25 billion hours of business application use across 22 million agents worldwide each year, according to the post.

“For an ASR system to be useful in this context, it must not only deliver an accurate transcription but do so with very low latency in a way that can be scaled to support many thousands of concurrent conversations efficiently,” Carmiel wrote in a post on the Google Developers Blog along with Staff Research Engineer at Google, Raziel Alvarez.

“For IntelligentWire, the integration of TensorFlow into Kaldi has reduced the ASR development cycle by an order of magnitude,” the post reads. “If a language model already exists in TensorFlow, then going from model to proof of concept can take days rather than weeks; for new models, the development time can be reduced from months to weeks.”

The primary issues they need to overcome in ASR are all things the developers think will be much more quickly handled with deep learning models: algorithms that are adaptable and expandable, knowing what data is valuable in the context of multiple languages and acoustic environments, and the pure computational power required to parse raw audio into something usable.

Carmiel and Xu hope that by bringing together two “vibrant” and active open-source user-bases, speech-based products and research will see an abundance of breakthroughs.

Microsoft won’t force Windows downloads on users, Go 1.9, and Project Brainwave

According to a press release on the Baden-Württemberg consumer rights center website, Microsoft will no longer download operating system files to users’ computers without their permission.

Germany’s consumer rights center had a lengthy battle with Microsoft, since the company’s approach to the new Windows 10 operating system would force users to download six gigabytes of installation files to their computers, whether the user agreed to the download or not. Microsoft finally made an announcement to avoid the continuation of legal action. The consumer rights center had hoped the resolution would come sooner, but according to reports, Microsoft’s decision could have a bearing on how the company acts in other countries.

“We would have wished for an earlier introduction, but the concession is a success for more consumer rights in the digital world,” says Cornelia Tausch, CEO of the consumer center in Baden-Württemberg. “We assume that Microsoft and other software producers will pay closer attention to which downloads are negligible and which are not. Six gigabytes certainly does not belong in that category.”

Go 1.9 is released
Google’s open-source Go project has been updated to version 1.9, bringing changes to the language, standard library, runtime, and tooling.

“Most of the engineering effort put into this release went to improvements of the runtime and tooling, which makes for a less exciting announcement, but nonetheless a great release,” the Go team said in the announcement on their development blog.

“The most important change to the language is the introduction of type aliases: a feature created to support gradual code repair,” the post reads.

In addition, the Go compiler has been sped up by compiling functions in a package concurrently.

MIT wants to bring rapid-prototype robotics to the masses
In “Interactive robogami: An end-to-end system for design of robots with ground locomotion,” published in The International Journal of Robotics Research, a team of researchers at MIT has outlined its origami-inspired system for the rapid fabrication of robots.

Interactive Robogami is a tool that lets users design ground robots which can be fabricated as flat sheets and then folded into 3D components.

“Using Interactive Robogami, designers can compose new robot designs from a database of print-and-fold parts,” the team writes in the abstract. “The designs are tested for the users’ functional specifications via simulation and fabricated on user satisfaction.”

According to their research, the tool has proven intuitive with inexperienced designers, and some of their proof-of-concept designs have had parts 3D printed and cut from sheet metal.

Microsoft’s Project Brainwave for real-time AI
Microsoft unveiled a new deep learning acceleration platform called Project Brainwave, which will be a major leap forward for performance and flexibility for cloud-based serving of deep learning models, according to Doug Burger, distinguished engineer at Microsoft, in a blog post.

Project Brainwave is built with three layers: a high-performance, distributed system architecture; a hardware DNN engine synthesized onto FPGAs; and a compiler and runtime for low-friction deployment of trained models, according to Burger.

“In the near future, we’ll detail when our Azure customers will be able to run their most complex deep learning models at record-setting performance,” writes Burger. “With the Project Brainwave system incorporated at scale and available to our customers, Microsoft Azure will have industry-leading capabilities for real-time AI.”

minoHealth is a health system developed to diagnose diseases more accurately than human health workers can. The system reportedly uses deep learning to predict and diagnose medical conditions in patients, an approach used by few healthcare systems so far.
The team wrote on their website:

“Futuristic Medical Health System seeking to Democratize Quality Healthcare with Artificial Intelligence(A.I) Medical Predictions/Diagnostics Systems, Cloud Medical Records System for Hospitals, Ministry of Health and Patients separately and “Big Data” Analytics.”

minoHealth currently has three AI healthcare systems.

• The first system predicts whether a female patient will develop diabetes within the next five years.
• The second and third systems determine whether a breast tumor is malignant or benign, using two separate approaches.

Deep learning is currently among the most effective techniques in artificial intelligence.

The minoHealth team also plans to work with epidemiologists in Ghana and the Ministry of Health to develop medical datasets for training further deep learning models, catering to even more of the medical conditions and healthcare needs of Ghanaians.

The artificial intelligence community is getting a new machine learning library to boost its research efforts. Yandex announced the open-sourcing of CatBoost this week. Despite its name, CatBoost has nothing to do with cats; it is a gradient boosting library.

“Gradient boosting is a machine learning algorithm that is widely applied to the kinds of problems businesses encounter every day like detecting fraud, predicting customer engagement and ranking recommended items like top web pages or most relevant ads. It delivers highly accurate results even in situations where there is relatively little data, unlike deep learning frameworks that need to learn from a massive amount of data,” Misha Bilenko, head of machine intelligence and research for Yandex, wrote in a post.
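The core loop Bilenko describes, fitting each new weak learner to the errors the ensemble still makes, can be sketched in plain Python. This is a generic illustration of gradient boosting for regression with squared-error loss (where the negative gradient is simply the residual), not CatBoost's implementation, and all names are illustrative.

```python
# Minimal gradient boosting for 1D regression with depth-1 "stumps".
# Each round fits a stump to the residuals (what the ensemble still gets
# wrong) and adds a damped copy of it to the model.

def fit_stump(xs, residuals):
    """Best single-split predictor: mean residual left/right of a threshold."""
    best = None
    for split in xs:
        left  = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        lv = sum(left) / len(left) if left else 0.0
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - (lv if x <= split else rv)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, split, lv, rv)
    _, split, lv, rv = best
    return lambda x: lv if x <= split else rv

def boost(xs, ys, rounds=50, lr=0.3):
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]   # negative gradient
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]            # a step function to learn
model = boost(xs, ys)
print(round(model(1), 2), round(model(4), 2))  # approaches 0 and 1
```

Libraries like CatBoost layer many refinements on top of this loop, notably its handling of categorical features and overfitting reduction, but the residual-fitting structure is the same.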

Features include the ability to reduce overfitting, categorical feature support, a user-friendly API for Python or R, and tools for formula analysis and training visualization. According to Bilenko, one of the key things about CatBoost is that it is able to provide results without extensive training data, unlike traditional machine learning models.

The team hopes the library will be used for a wide variety of industrial machine learning tasks. The most common use cases will range from finance to scientific research. In addition, it can be integrated with deep learning tools such as Google’s TensorFlow.

“By making CatBoost available as an open-source library, we hope to enable data scientists and engineers to obtain top-accuracy models with no effort, and ultimately define a new standard of excellence in machine learning,” Bilenko wrote.

Top 5 projects trending on GitHub this week:
#1. Bash Snippets: A collection of small bash scripts.
#2. Awesome Guidelines: Check out this list for high quality coding style conventions and standards.
#3. Pell: A tiny WYSIWYG text editor for the web. Up from last week’s number 5 slot!
#4. Deep Learning Project: An in-depth, end-to-end tutorial of a machine learning pipeline from scratch.
#5. Practical Node: A first edition of a book about building real-world scalable web apps.

Stanford computer scientists believe they have developed an algorithm that can diagnose heart arrhythmias with cardiologist-level accuracy. The new deep learning algorithm sifts through hours of data to find irregular heartbeats.

Typically, arrhythmias are detected with an electrocardiogram, but doctors often prescribe a wearable ECG to continuously monitor a patient’s heartbeat. This wearable device results in hundreds of hours of data. The researchers worked with a heartbeat monitor company, iRhythm, to train a deep neural network model that could accurately detect irregularities in a massive dataset.

According to the researchers, the algorithm is capable of diagnosing 14 different types of heart rhythm defects better than some trained cardiologists. They hope this will help speed up diagnosis, and improve treatment. In addition, the researchers say this could benefit people in remote locations that don’t have access to cardiologists.

“One of the big deals about this work, in my opinion, is not just that we do abnormality detection but that we do it with high accuracy across a large number of different types of abnormalities,” Awni Hannun, a Stanford graduate student, said in a statement. “This is definitely something that you won’t find to this level of accuracy anywhere else.”

For most of the past 30 years, computer vision technologies have struggled to help humans with visual tasks, even those as mundane as accurately recognizing faces in photographs.

Recently, though, breakthroughs in deep learning, an emerging field of artificial intelligence, have finally enabled computers to interpret many kinds of images as successfully as, or better than, people do.

Companies are already selling products that exploit the technology, which is likely to take over or assist in a wide range of tasks that people now perform, from driving trucks to reading scans for diagnosing medical disorders.

Recent progress in a deep-learning approach known as a convolutional neural network (CNN) is key to the latest strides. To give a simple example of its prowess, consider images of animals.

Whereas humans can easily distinguish between a cat and a dog, CNNs allow machines to categorize specific breeds more successfully than people can. They excel because they are better able to learn, and draw inferences from, subtle, telling patterns in the images.

Convolutional neural networks do not need to be programmed to recognize specific features in images—for example, the shape and size of an animal’s ears. Instead they learn to spot features such as these on their own, through training.

To train a CNN to separate an English springer spaniel from a Welsh one, for instance, you start with thousands of images of animals, including examples of either breed. Like most deep-learning networks, CNNs are organized in layers. In the lower layers, they learn simple shapes and edges from the images.

In the higher layers, they learn complex and abstract concepts—in this case, features of ears, tails, tongues, fur textures, and so on. Once trained, a CNN can easily decide whether a new image of an animal shows a breed of interest.
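The "simple shapes and edges" learned in the lower layers come from convolutions: small filters slid across the image. Here is a minimal Python sketch of one convolution with a hand-written vertical-edge filter; in a trained CNN the filter weights would be learned from data rather than hard-coded.

```python
# One 3x3 convolution over a tiny grayscale image (stride 1, no padding).
# The filter responds strongly where pixel values change left-to-right,
# i.e. at vertical edges: the kind of feature a CNN's first layer learns.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# 6x6 image: dark left half (0), bright right half (1)
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Vertical-edge filter (a Sobel-style kernel, hard-coded for illustration)
kernel = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]

response = conv2d(image, kernel)
print(response[0])  # strongest response at the dark-to-bright boundary: [0, 4, 4, 0]
```

Stacking many such filters, interleaved with nonlinearities and pooling, is what lets the higher layers compose edges into ears, tails, and fur textures.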

CNNs were made possible by the tremendous progress in graphics processing units and parallel processing in the past decade. But the Internet has made a profound difference as well by feeding CNNs’ insatiable appetite for digitized images.

Computer-vision systems powered by deep learning are being developed for a range of applications. The technology is making self-driving cars safer by enhancing the ability to recognize pedestrians. Insurers are starting to apply deep-learning tools to assess damage to cars.

In the security camera industry, CNNs are making it possible to understand crowd behavior, which will make public places and airports safer. In agriculture, deep-learning applications can be used to predict crop yields, monitor water levels and help detect crop diseases before they spread.

Deep learning for visual tasks is making some of its broadest inroads in medicine, where it can speed experts’ interpretation of scans and pathology slides and provide critical information in places that lack professionals trained to read the images—be it for screening, diagnosis, or monitoring of disease progression or response to therapy.

This year, for instance, the U.S. Food and Drug Administration approved a deep-learning approach from the start-up Arterys for visualizing blood flow in the heart; the purpose is to help diagnose heart disease. Also this year, Sebastian Thrun of Stanford University and his colleagues described a system in Nature that classified skin cancer as well as dermatologists did. The researchers noted that such a program installed on smartphones, which are ubiquitous around the world, could provide “low-cost universal access to vital diagnostic care.”

Systems are also being developed to assess diabetic retinopathy (a cause of blindness), stroke, bone fractures, Alzheimer’s disease and other maladies.


Samsung’s latest flagships, the Galaxy S8 and the Galaxy S8+, managed to grab everyone’s attention with their slew of new additions and features. From the curved display to the lack of bezels, from the refreshed specifications to the improved software experience, the Galaxy S8 had a lot of positives going in its favor.

One feature, though, invited reactions contrary to what Samsung was expecting and hoping for. The Galaxy S8 and S8+ are the first Samsung smartphones to feature Samsung’s Bixby virtual assistant. Samsung had high hopes for Bixby, so much so that it incorporated a dedicated (fixed and officially non-remappable) hardware button on the devices solely to quick-launch Bixby.

Unfortunately, when consumers got their hands on Galaxy S8 devices, they received only a shadow of the Bixby functionality that was promised to them in Samsung’s keynote. Bixby lacked Voice functionality in the English language on launch, with the hardware key being relegated to opening a glorified feed of suggestions, weather data and social media trends. What was to be a defining aspect of the S8 user experience simply did not materialize.

A new report originating from Korea attributes the absence of English voice functionality in Bixby to Samsung’s lack of accumulated big data. Big data is said to be the key to deep learning technology, which learns and evolves the more users use it. Since Samsung is jumping into the virtual assistant race late, it lacks the competitive edge that competitors like Google, Amazon and Apple have accumulated and honed over time. As a result, the English version of Bixby has been delayed.

“Developing Bixby in other languages is taking more time than we expected mainly because of the lack of the accumulation of big data.”

Samsung has already launched a test preview version of Bixby for some users in the US, but the report mentions that the preview has received mixed feedback. Many early users are reporting unsatisfactory performance in Bixby’s ability to respond to requests and questions.

Apart from the lack of accumulated big data to feed on, Samsung is also facing other challenges in developing the English version of Bixby. Geographical and language barriers are slowing down progress, as engineers in the US who are working on the English version have to frequently communicate reports and wait for management in Korea to respond. This adds to the difficulty of developing the English version as compared to the Korean version.

Interestingly, the report also mentions that Samsung’s AI acquisition, Viv Labs, has not yet been put to full use. Viv Labs’ AI technology has not yet been applied to Bixby, and will be used later, when Bixby becomes more complete. But with the challenges Samsung currently faces, one can only wonder when that will happen.

At ISC High Performance 2017, held in Frankfurt, Germany, deep learning is driving new computing innovation as processor manufacturers and systems developers race to deliver products optimised for deep learning applications.

Apparently, 2017 is the year that deep learning becomes a mainstream computing technology. This is good news for HPC developers, as it is increasing demand for HPC hardware, but there are still optimisations that must be made to fine-tune both hardware and software for use in deep learning and future AI research.

Cray announced the Cray Urika-XC analytics software suite, which aims to bring analytics and deep learning tools to the company’s line of Cray XC supercomputers.

Nvidia launched its PCIe-based Volta V100 GPU. The company also demonstrated the use of its GPU technology in combination with deep learning as part of the Human Brain Project.

HPE launched new server solutions aimed specifically at HPC and AI workloads, while Mellanox highlighted its work to fine-tune its technology for AI and deep learning applications. Mellanox announced that deep learning frameworks such as TensorFlow, Caffe2, Microsoft Cognitive Toolkit, and Baidu PaddlePaddle can now leverage Mellanox’s smart offloading capabilities, which the company claims can provide near-linear scaling across multiple AI servers.

Shifting paradigms

The Cray Urika-XC solution is a set of applications and tools optimised to run seamlessly on the Cray XC supercomputing platform. In basic terms the company is taking the toolset it has developed through the Urika GX platform, optimising it for deep learning and then applying the software and toolsets to its XC series of supercomputers.

The software package comprises the Cray Graph Engine, the Apache Spark analytics environment, the BigDL distributed deep learning framework for Spark, the distributed Dask parallel computing libraries for analytics, and widely used languages for analytics including Python, Scala, Java, and R.

The Cray Urika-XC analytics software suite highlights the convergence of traditional HPC and data-intensive computing – such as deep learning – as core workloads for supercomputing systems in the coming years.

As data volumes in HPC grow, the industry is responding by moving away from the previous FLOPS-centric model to a more data-centric one. This requires innovation not only in parallel processing, network, and storage performance but also in the software and tools used to process the vast quantities of data needed to train deep learning networks.

While deep learning is not the only trigger for this new model it exemplifies the changing paradigm of architectural design in HPC.

One example of this is the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, which currently uses the Cray Urika-XC solution on ‘Piz Daint,’ a system that, after its recent upgrade, is now one of the fastest supercomputers in the world.

‘CSCS has been responding to the increased needs for data analytics tools and services,’ said Professor Thomas Schulthess, director of the Swiss National Supercomputing Centre (CSCS). ‘We were very fortunate to participate with our Cray supercomputer Piz Daint in the early evaluation phase of the Cray Urika-XC environment. Initial performance results and scaling experiments using a subset of applications including Apache Spark and Python have been very promising.  We look forward to exploring future extensions of the Cray Urika-XC analytics software suite.’

Also this week at ISC, Nvidia announced the PCI Express version of its latest Tesla GPU accelerator, the Volta-based V100. The SXM2 form factor card was first announced earlier this year at the company’s GPU Technology Conference (GTC), but users can now connect the Volta-based GPU through the more traditional PCIe slot.

It is not just hardware in the spotlight, however, as the company also highlighted some of the latest research making use of these technologies, such as the Human Brain Project. Created in 2013 by the European Commission, the project’s aims include gathering, organizing and disseminating data describing the brain and its diseases, and simulating the brain itself.

Scientists at the Jülich Research Center (Forschungszentrum Jülich), in Germany, are developing a 3D multi-modal model of the human brain. They do this by analysing thousands of ultrathin histological brain slices using microscopes and advanced image analysis methods — and then reconstructing these slices into a 3D computer model.

Analysing and registering high-resolution 2D image data into a 3D reconstruction is both data- and compute-intensive. To process this data as fast as possible, the Jülich researchers are using the JURON supercomputer – one of two pilot systems delivered by IBM and NVIDIA to the Jülich Research Center.

The Juron cluster is composed of 18 IBM Minsky servers, each with four Tesla P100 GPU accelerators with NVIDIA NVLink interconnect technology.

Deep learning drives product innovation across the industry

Hewlett Packard Enterprise was also keen to get in on the AI action as the company launched the HPE Apollo 10 Series.

HPE Apollo 10 Series is a new platform optimised for entry-level deep learning and AI applications. The HPE Apollo sx40 System is a 1U dual-socket Intel Xeon Gen10 server with support for up to four NVIDIA Tesla SXM2 GPUs with NVLink. The HPE Apollo pc40 System is a 1U dual-socket Intel Xeon Gen10 server with support for up to four PCIe GPU cards.

‘Today, customers’ HPC requirements go beyond superior performance and efficiency,’ said Bill Mannel, vice president and general manager, HPC and AI solutions, Hewlett Packard Enterprise. ‘They are also increasingly considering security, agility and cost control. With today’s announcements, we are addressing these considerations and delivering optimised systems, infrastructure management, and services capabilities that provide A New Compute Experience.’

Collaboration to drive AI performance

Mellanox announced that it is optimising its existing technology to help accelerate deep learning performance. The company announced that deep learning frameworks such as TensorFlow, Caffe2, Microsoft Cognitive Toolkit, and Baidu PaddlePaddle can now leverage Mellanox’s smart offloading capabilities to increase performance and, the company claims, provide near-linear scaling across multiple AI servers.

The Mellanox announcement highlights the company’s work to ensure its products can meet the requirements of users running deep learning workloads, but it also demonstrates Mellanox’s willingness to work with partners, such as Nvidia, to further increase the performance and integration of their individual technologies.

‘Advanced deep neural networks depend upon the capabilities of smart interconnect to scale to multiple nodes, and move data as fast as possible, which speeds up algorithms and reduces training time,’ said Gilad Shainer, vice president of marketing at Mellanox Technologies. ‘By leveraging Mellanox technology and solutions, clusters of machines are now able to learn at a speed, accuracy, and scale that push the boundaries of the most demanding cognitive computing applications.’

One of the key points of this announcement is that Mellanox is working with partners to ensure that deep learning frameworks and hardware (such as Nvidia GPUs) are compatible with Mellanox interconnect fabric to help promote the use of Mellanox networking solutions to AI/deep learning users.

More information was provided by Duncan Poole, director of platform alliances at NVIDIA: ‘Developers of deep learning applications can take advantage of optimised frameworks and NVIDIA’s upcoming NCCL 2.0 library which implements native support for InfiniBand verbs and automatically selects GPUDirect RDMA for multi-node or NVIDIA NVLink when available for intra-node communications.’