Apache Mnemonic is now a Top-Level Project

The Apache Software Foundation has announced that Apache Mnemonic has graduated from the Apache Incubator to become a Top-Level Project, a milestone signifying that the project’s community and products have matured under the foundation’s governance process.

Apache Mnemonic is an open-source object platform for processing and analysis of linked objects, according to the foundation. It is designed to address Big Data performance issues such as serialization, caching, computing bottlenecks, and non-volatile memory storage media.

“The Mnemonic community continues to explore new ways to significantly improve the performance of real-time Big Data processing/analytics,” said Gang “Gary” Wang, vice president of Apache Mnemonic. “We worked hard to develop both our code and community the Apache Way, and are honored to graduate as an Apache Top-Level Project.”

The Java-based project includes a unified platform enabling framework, a durable object model and computing model, an extensible focal point for optimization, and integration with Big Data projects like Apache Hadoop and Apache Spark. The project has been used in industries like eCommerce, financial services and semiconductors.

“Apache Mnemonic fills the void of the ability to directly persist on-heap objects, making it beneficial for use in production to accelerate Big Data processing applications at several large organizations,” said Henry Saputra, ASF member and Apache Mnemonic incubating mentor. “I am pleased how the community has grown and quickly embraced the Apache Way of software development and making progressive releases. It has been a great experience to be part of this project.”

With the project, objects can be accessed directly from Java or through other languages such as C and C++. According to the team’s website, it is currently working on a pure Java memory service, durable object vectorization, and durable query service features.

“Apache Mnemonic provides a unified interface for memory management,” said Yanhui Zhao, Apache Mnemonic committer. “It is playing a significant role in reshaping the memory management in current computer architecture along with the developments of large capacity NVMs, making a smooth transition from present mechanical-based storage to flash-based storage with the minimum cost.”

Facebook open sources new build features for Android developers

Facebook is building on its open-source performance build tool, Buck, to speed up development and minimize the time it takes to test code changes in Android apps.

Buck is designed to speed up builds, add reproducibility to builds, provide correct incremental builds, and help developers understand dependencies. The company first open sourced the solution in 2013.

“We’ve continued to steadily improve Buck’s performance, together with a growing community of other organizations that have adopted Buck and contributed back. But these improvements have largely been incremental in nature and based on long-standing assumptions about the way software development works,” Jonathan Keljo, software engineer at Facebook, wrote in a post. “We took a step back and questioned some of these core assumptions, which led us deep into the nuances of the Java language and the internals of the Java compiler.”

According to Keljo, the team has completely redesigned the way Buck compiles Java code in order to provide new performance improvements for Android engineers.

The solution also introduces rule pipelining, which Keljo says is designed to shorten bottlenecks and increase parallelism, reducing build times by 10 percent.

“Buck is usually able to build multiple rules in parallel. However, bottlenecks do occur. If a commonly used rule takes awhile to build, its dependents have to wait. Even small rules can cause bottlenecks on systems with a high enough number of cores,” Keljo wrote.

Rule pipelining now enables dependent rules to compile while the compiler is still finishing up dependencies. This feature is now available in open source, but is not turned on by default.

The company is also announcing source-only stub generation to flatten the dependency graph and reduce build times by 30 percent.

“Flatter graphs produce faster builds, both because of increased parallelism and because the paths that need to be checked for changes are shorter,” Keljo wrote.

More information is available in Keljo’s post on Facebook’s engineering blog.

Node.js 9 released as version 8 enters long-term support

The Node.js Foundation announced the release of version 9 of the Node.js JavaScript runtime today, while Node.js 8 is going into long-term support.

The community-driven, open-source runtime has seen use in enterprise applications, robotics, API toolkits, serverless apps, mobile websites and more, according to the foundation, and the start of long-term support means companies should begin migrating to version 8.

The foundation says version 8 was one of the biggest releases from the platform’s community, bringing 20% faster performance in web apps than the previous version 6.

“A top priority for enterprises is to ensure applications are performant. New features like HTTP/2 and the latest V8 JavaScript Engine deliver the performance global organizations require to run their business at scale,” said Mark Hinkle, executive director of the Node.js Foundation. “Node.js builds are growing faster than ever thanks to the long-term support strategy, and the large and active community that includes 13 working groups, 21 members of the Technical Steering Committee, and more than 1,600 contributors to Node.js united under the Node.js Foundation.”

Updates in Node.js 8 bring the V8 JavaScript Engine 6.1 and HTTP/2 support on board, along with updates to the Node.js API for better stability and backwards compatibility. The Node.js Foundation says these API changes move the platform toward VM neutrality and open Node.js to more environments, such as IoT and mobile.

Other Node.js 8 features include a stable module API and async/await support for writing more linear code.

While Node.js 9 is available, the foundation says that it’s geared more towards testing and experimenting with bleeding-edge features and recommends Node.js 8 for ongoing development.

Microsoft and Amazon announce deep learning library Gluon

Microsoft has announced a new partnership with Amazon to create an open-source deep learning library called Gluon. The idea behind Gluon is to make artificial intelligence more accessible and valuable.

According to Microsoft, the library simplifies the process of building deep learning models and will enable developers to run multiple deep learning libraries. The announcement follows the introduction of the Open Neural Network Exchange (ONNX) format, another effort to foster an open AI ecosystem.

Gluon supports symbolic and imperative programming, which is something not supported by many other toolkits, Microsoft explained. It also will support hybridization of code, allowing compute graphs to be cached and reused in future iterations. It offers a layers library that reuses pre-built building blocks to define model architecture. Gluon natively supports loops and ragged tensors, allowing for high execution efficiency for RNN and LSTM models, as well as supporting sparse data and operations. It also provides the ability to do advanced scheduling on multiple GPUs.

“This is another step in fostering an open AI ecosystem to accelerate innovation and democratization of AI-making it more accessible and valuable to all,” Microsoft wrote in a blog post. “With Gluon, developers will be able to deliver new and exciting AI innovations faster by using a higher-level programming model and the tools and platforms they are most comfortable with.”

The library will be available for Apache MXNet or Microsoft Cognitive Toolkit. It is already available on GitHub for Apache MXNet, with Microsoft Cognitive Toolkit support on the way.

GitHub Universe outlines plans for the future of software development

About ten years ago, GitHub embarked on a journey to create a platform that brought together the world’s largest developer community. Now that the company believes it has reached its initial goals, it is looking to the future with plans to expand the ecosystem and transform the way developers code through new tools and data.

“Development hasn’t had that much innovation arguably in the past 20 years. Today, we finally get to talk about what we think is the next 20 years, and that is development that is fundamentally different and driven by data,” said Miju Han, engineering manager of data science at GitHub.

The company announced new tools at its GitHub Universe conference in San Francisco that leverage its community data to protect developer code, provide greater security, and enhance the GitHub experience.

“It is clear that security is desperately needed for all of our users, open source and businesses alike. Everyone using GitHub needs security. We heard from our first open source survey this year that open source users view security and stability above all else, but at the same time we see that not everyone has the bandwidth to have a security team,” said Han.

GitHub is leveraging its data to help developers manage the complexity of dependencies in their code with the newly announced dependency graph. The dependency graph enables developers to easily keep track of their packages and applications without leaving their repository. It currently supports Ruby and JavaScript, with plans to add Python support in the near future.

In addition, the company revealed new security alerts that will use human data and machine learning to track when dependencies are associated with public security vulnerabilities, and recommend security fixes.

“This is one of the first times where we are going from hosting code to saying this is how it could be better, this is how it could be different,” said Han.

On the GitHub experience side, the company announced the ability to discover new projects with news feed and explore capabilities. “We want people to dig deeper into their interests and learn more, which is one of the core things it means to be a developer,” said Han.

The new news feed capabilities allow users to discover repositories right from their dashboard and receive recommendations on open source projects to explore. The recommendations are based on the people users follow, their starred repositories, and popular GitHub projects.

“You’re in control of the recommendations you see: Want to contribute to more Python projects? Star projects like Django or pandas, follow their maintainers, and you’ll find similar projects in your feed. The ‘Browse activity’ feed in your dashboard will continue to bring you the latest updates directly from repositories you star and people you follow,” the company wrote in a blog.

The “Explore” experience has been completely redesigned to connect users with curated collections, topics, and resources so they can dig into a specific interest like machine learning or data protection, according to Han.

Han went on to explain that the newly announced features are just the beginning of how the company plans to take code, make it better, and create an ecosystem that helps developers move forward.

“These experiences are a first step in using insights to complement your workflow with opportunities and recommendations, but there’s so much more to come. With a little help from GitHub data, we hope to help you find work you’re interested in, write better code, fix bugs faster, and make your GitHub experience totally unique to you,” the company wrote.

EdgeX Foundry launches first major code release

Linux Foundation open source project EdgeX Foundry has launched Barcelona, the first major code release of its common open framework for IoT edge computing, originally announced in April. The release features key API stabilization, better code quality, reference Device Services supporting BACnet, Modbus, Bluetooth Low Energy (BLE), MQTT, SNMP, and Fischertechnik, and double the test coverage across EdgeX microservices.

Barcelona is the result of collaboration between over 50 member organizations and aims to provide an ecosystem for Industrial IoT solutions. These members provide products that support analytics, visualization, security, and more. More than 150 people from across the globe have met to establish project goals, working groups, and project maintainers and committers.

EdgeX says that the complexity of the IoT landscape has caused issues among businesses looking to deploy their own IoT solutions. EdgeX Foundry hopes to solve these issues by building this open source framework. The framework will be built on plug-and-play components designed to accelerate the deployment of IoT solutions.

“Barcelona is a significant milestone that showcases the commercial viability of EdgeX and the impact that it will have on the global Industrial IoT Landscape,” said Philip DesAutels, senior director of IoT at The Linux Foundation.

EdgeX has established a biannual release roadmap, and its next major release, “California,” is planned for next spring. California will continue to expand the framework to support requirements for deployment in business-critical Industrial IoT applications. According to the company, “In addition to general improvements, planned features for the California release include baseline APIs and reference implementations for security and manageability value-add.”

Live demonstrations of this platform will be taking place at IoT Solutions World Congress in Barcelona, Spain this week.

The modern digital enterprise collects data on an unprecedented scale. Andrew Ng, currently at startup deeplearning.ai, formerly chief scientist at Chinese internet giant Baidu and co-founder of education startup Coursera, says, like electricity 100 years ago, “AI will change pretty much every major industry.” Machine Learning (ML) is a popular application of AI that refers to the use of algorithms that iteratively learn from data. ML, at its best, allows companies to find hidden insights in data without explicitly programming where to look.

Applications built based on ML are proliferating quickly. The list of well-known uses is long and growing every day. Apple’s Siri, Amazon’s recommendation engine, and IBM’s Watson are just a few prominent examples. All of these applications sift through incredible amounts of data and provide insights mapped to users’ needs.

Why is ML exploding in popularity? Because the foundational technology in ML is openly available and accessible to organizations without specialized skill sets. Open source provides key technologies that make ML easy to learn, integrate and deploy into existing applications. This has lowered the barrier to entry and quickly opened ML to a much larger audience.

In the past two years, there has been an explosion of projects and development tools. The vast majority of consequential ones are open source. TensorFlow, just one key example, is a powerful system for building and training neural networks to detect and decipher patterns and correlations, similar to human learning and reasoning. It was open-sourced by Google at the end of 2015.

Main Languages for ML – Open Source Dominates

Open source programming languages are extremely popular in ML due to widespread adoption, supportive communities, and advantages for quick prototyping and testing.

For application languages, Python has a clear lead with interfaces and robust tools for almost all ML packages. Python has the added benefit of practically ubiquitous popularity. It is easy to integrate with applications and provides a wide ecosystem of libraries for web development, microservices, games, UI, and more.

Beyond Python, other open-source languages used in ML include R, Octave, and Go, with more coming along. Some of these, like R and Octave, are statistical languages that provide many of the tools needed for data analysis and sandboxed experimentation. Go, developed and backed by Google, is newer and is an excellent server and systems language with a growing library of data science tools. Its advantages include compiled code and speed, and its adoption rates are increasing dramatically.

Python Tools and Libraries for ML – An Introduction

The amazing strength of open source is in the proliferation of powerful tools and libraries that get you up and running quickly. At the core of the Python numerical/scientific computing ecosystem are NumPy and SciPy. NumPy and SciPy are foundational libraries on top of which many other ML and data science packages are built. NumPy provides support for numerical programming in Python. NumPy has been in development since 2006 and just received US$645,000 in funding this summer.
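To give a flavor of what NumPy’s numerical support looks like in practice, here is a minimal sketch (using only standard NumPy array operations) that replaces explicit Python loops with vectorized computation:

```python
import numpy as np

# Build a 3x3 matrix and a vector, then compute a matrix-vector
# product and a column-wise mean without any explicit loops.
A = np.arange(9, dtype=float).reshape(3, 3)   # [[0,1,2],[3,4,5],[6,7,8]]
v = np.array([1.0, 0.0, -1.0])

product = A @ v             # matrix-vector multiply: [-2., -2., -2.]
col_means = A.mean(axis=0)  # mean of each column: [3., 4., 5.]

print(product)
print(col_means)
```

Operations like these, executed in optimized C rather than interpreted Python, are what the rest of the ML stack builds on.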

SciKit-Learn, with 20k stars and 10.7k forks, provides simple and efficient tools for data mining and data analysis. It is accessible to everybody, and reusable in various contexts. Built on NumPy, SciPy, and matplotlib, SciKit-Learn is very actively maintained and supports a wide variety of the most common algorithms including Classification, Regression, Clustering, Dimensionality Reduction, Model Selection, and Preprocessing. This is open source that is immediately ready for commercial implementation.
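The fit/predict workflow SciKit-Learn offers for those algorithms is uniform across classifiers. A minimal sketch, using a nearest-neighbor classifier on invented toy data:

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D points with two well-separated classes.
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
y = [0, 0, 1, 1]

# Every scikit-learn estimator follows the same fit/predict pattern.
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X, y)

print(clf.predict([[0.2, 0.1], [5.1, 5.2]]))  # -> [0 1]
```

Swapping in a different algorithm (say, LogisticRegression or RandomForestClassifier) changes only the constructor line, which is a large part of why the library is so approachable.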

Keras is a Python Deep Learning library that allows for easy and fast prototyping and does not need significant ML expertise. It has been developed with a focus on enabling fast experimentation and being able to go from idea to result with the least possible delay. Keras can use TensorFlow, Microsoft Cognitive Toolkit (CNTK) or Theano as its backend, and you can swap between the three. Keras has 17.7k stars and 6.3k forks. Keras supports both convolutional networks and recurrent networks, as well as combinations of the two, and runs seamlessly on CPU and GPU.
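As a hedged illustration of the fast-prototyping style described above, defining a small network in Keras takes only a few lines (this sketch uses the `tensorflow.keras` namespace and standard layer classes; no training is shown):

```python
from tensorflow import keras
from tensorflow.keras import layers

# A tiny fully connected classifier: 4 input features, 2 classes.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

print(len(model.layers))  # -> 2
```

The same model definition runs unchanged whichever supported backend executes it, which is the portability Keras is known for.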

TensorFlow is Google’s library for ML, which expresses calculations as a computation graph. With 64k stars and 31k forks, it is possibly one of the most popular projects on all of GitHub and is becoming the standard intermediate format for many ML projects. Python is Google’s recommended language, though bindings for other languages exist.

These three superstar foundational ML tools are all open source and represent just a taste of the many important applications available to companies building ML strategies.

The Importance of ML Open Source Communities

Open source is built by communities that connect developers, users and enthusiasts in a common endeavor. Developers get useful examples and a feeling that others are extending the same topics. Communities provide examples, support and motivation that proprietary tools often lack. This also lowers the barrier to entry. Plus, many active ML communities are backed by large players like Google, Microsoft, Apple, Amazon, Apache and more.

Linux Foundation wants to promote sustainable open source development with new initiatives

During last week’s Open Source Summit North America in Los Angeles, the Linux Foundation announced a series of projects designed to promote sustainability and growth in open source development.

We wrote last week about the foundation’s “Open Source Guides for the Enterprise,” which will see a series of guides by professionals from many different organizations released over the next few months.

Following that, the foundation announced the Community Health Analytics for Open Source Software, or CHAOSS, project. With CHAOSS, the Linux Foundation wants to provide a platform for measuring and analyzing open source projects.

The foundation also announced that it has granted CII security badges to 100 projects through a voluntary process that lets open source projects demonstrate that their security practices meet professional standards.

And finally, the foundation is involved in the Kubernetes Certified Service Provider project, which certifies companies already versed in Kubernetes technology to provide support for enterprises adopting the rapidly growing container orchestration system.

In a post on the foundation’s blog, Linux Foundation Executive Director Jim Zemlin explained why these projects will be important.

“The big question we ask ourselves at The Linux Foundation is: Of the 64 million open-source projects out there, which are the ones that really matter?” he wrote. “We think that projects with sustainable ecosystems are the ones that really matter. These are the open-source projects that will be supported. They provide the security and quality codebase that you can build future technologies on.”

Zemlin says that the many open source projects in active development at the Linux Foundation and influential projects coming from organizations like the Apache Software Foundation, the Eclipse Foundation and the OpenStack Foundation, all follow the sort of development principles that he believes will promote sustainability.

With these sorts of guidelines and support available, Zemlin says it will become clearer and easier for enterprises to evaluate which open-source projects are worth using and contributing to, which in turn will promote the growth of these worthwhile projects.

GitHub project of the week: Yarn 1.0

The team behind Yarn, a fast, secure, open-source alternative npm client, announced the 1.0 release of the JavaScript package manager, a major step for the project. In the 11 months since its initial release in 2016, Yarn has been adopted by more than 175,000 projects on GitHub, and it’s responsible for nearly three billion package downloads per month.

So what’s new in Yarn? Yarn added a new feature called Workspaces, which lets people automatically aggregate all the dependencies from multiple package.json files and install them all in one go. It also uses a single yarn.lock file at the root, to lock them all, according to a Facebook post debuting the Yarn 1.0 release.
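As an illustration (a hypothetical layout, not taken from the Yarn announcement), a root package.json that enables Workspaces might look like this:

```json
{
  "private": true,
  "name": "example-monorepo",
  "workspaces": ["packages/*"]
}
```

Running yarn install at the root then installs the dependencies of every package under packages/ together, producing the single root yarn.lock described above.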

Workspaces is already used by some teams at Facebook, such as the Babel team. Lerna, a mono-repository management tool, lets you opt in to Yarn’s Workspaces.

“By making Workspaces native to Yarn, we hope to enable faster and lighter installations by preventing package duplication between the smaller parts of a larger project,” read the Facebook blog.

Also in Yarn 1.0 is a new auto-merging of lockfiles feature. When there’s a merge conflict in the lockfile, Yarn will automatically handle the conflict resolution upon running yarn install, according to the blog. If it succeeds, the conflict-free lockfile is saved to disk.

The next time you have a lockfile conflict, you can save time by running yarn install instead of doing a manual resolution, according to the Yarn team.

Besides some of the top new features, Yarn also improved its interactive upgrade experience, it includes a faster file integrity check, and there’s a separate lockfile parser module that you can use in your project.

Top five projects trending on GitHub this week

#1. Every Programmer Should Know: A collection of (mostly) technical things every software developer should know

#2. R2: HTTP client. Spiritual successor to request.

#3. WTFPython: A collection of interesting, subtle, and tricky Python snippets.

#4. Easy Mock: A persistent service that generates mock data quickly and provides visualization view.

#5. Clean Code PHP: 🛁 Clean Code concepts adapted for PHP

AnsibleFest: Red Hat announces Ansible Engine, new Ansible innovations for the enterprise

Ansible is quickly becoming one of the world’s most popular open source IT automation technologies, with nearly 3,000 unique contributors and more than 32,000 commits to the upstream Ansible project. At Red Hat’s AnsibleFest in San Francisco, the company announced several new Ansible innovations aimed at enterprise customers that want to utilize Ansible and continue to adopt automation technology at the enterprise level.

According to Joe Fitzgerald, vice president of management at Red Hat, the company has seen a strong interest in Ansible, especially since enterprise customers are interested in more stable and reliable technologies. And, as “more automation implementations directly impact mission-critical business applications and environments, the requirements for greater security, support and stability become even more important,” said Fitzgerald.

One offering announced at the conference is Red Hat Ansible Engine, a new offering designed to bring enterprise-grade global support to the Ansible automation community project. Enterprises can use Ansible Engine to access the tools and innovations of Ansible technology in a hardened, enterprise-grade manner, according to a Red Hat statement today.

Ansible Engine features a reliable and enterprise-ready set of Ansible automation, modules and capabilities, support from Red Hat, and benefits from a Red Hat subscription, including Open Source Assurance, Service Level Agreement (SLA) response, regular security and maintenance updates, and more.

Red Hat Ansible Engine is also available with new Networking Add-on, which includes engineering support for Ansible modules like Arista (EOS), Cisco (IOS, IOS-XR, NX-OS), Juniper (Junos OS), Open vSwitch, and VyOS.

In addition to the new Ansible Engine, Red Hat also announced Red Hat Ansible Tower 3.2, with new and enhanced features that provide teams with updated inventory support, Tower instance groups, Tower isolated nodes, SCM inventory support, pluggable credentials, and more. Ansible Tower 3.2 is also the first version based on the open source AWX project, a new open source community project sponsored by Red Hat. It enables users to directly interact and add features or capabilities to drive innovation toward Ansible Tower, according to a Red Hat statement.