The AI Index 2017 Report is a starting point for the conversation about rigorously measuring activity and progress in AI. This report aggregates a diverse set of data, makes that data accessible, and includes discussion about what is provided and what is missing.
Facebook's new feature will scan all posts for patterns of suicidal thoughts and, when necessary, send mental health resources to the user at risk or their friends, or contact local first responders. By using AI to flag worrisome posts to human moderators instead of waiting for user reports, Facebook can decrease how long it takes to send help.
AWS announced multiple AI-related services and initiatives at its re:Invent conference, including NLP, video analysis, and model deployment services. It also announced the AWS Machine Learning Research Awards, a new program that funds university departments, faculty, PhD students, and post-docs.
With the intention of integrating the new technology into its own assistant, Bixby, Samsung acquired the South Korean company Fluenty. The startup offered automated text replies for SMS and third-party social services.
Posts, Articles, Tutorials
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification (CTC), an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
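One ingredient of CTC is the many-to-one mapping from frame-level alignments to output sequences: repeated symbols are merged, then blank symbols are dropped. The following is a minimal sketch of that mapping, plus a greedy (best-path) decoder built on it. The blank symbol `_`, the function names, and the toy probabilities are assumptions made for illustration; the CTC training loss itself, which sums over all valid alignments, is not shown.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(alignment, blank=BLANK):
    """Merge consecutive repeats, then remove blanks."""
    merged = [symbol for symbol, _ in groupby(alignment)]
    return [symbol for symbol in merged if symbol != blank]


def greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most likely symbol per frame, then collapse the result."""
    best_path = []
    for probs in frame_probs:
        best_index = max(range(len(probs)), key=probs.__getitem__)
        best_path.append(alphabet[best_index])
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # The blank between the two "l" runs keeps the double letter.
    print("".join(ctc_collapse(list("hh_e_ll_llo"))))

    # Toy per-frame distributions over the alphabet ["_", "a", "b"].
    toy_probs = [
        [0.1, 0.8, 0.1],
        [0.1, 0.7, 0.2],
        [0.9, 0.05, 0.05],
        [0.2, 0.1, 0.7],
    ]
    print(greedy_decode(toy_probs, ["_", "a", "b"]))
```

Greedy decoding is only an approximation: it returns the output of the single most likely alignment, which is not always the most likely output sequence.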
The impossibility of Intelligence Explosion
The concept of an “intelligence explosion” — leading to the sudden rise of “superintelligence” and the accidental end of the human race — has taken hold in the AI community. This post argues that such an event is impossible — that the notion of intelligence explosion comes from a profound misunderstanding of both the nature of intelligence and the behavior of recursively self-augmenting systems.
Specifying AI safety problems in simple environments
As AI systems become more general and more useful in the real world, ensuring they behave safely will become even more important. The post and paper introduce a selection of simple reinforcement learning environments (code here) designed specifically to measure “safe behaviors”.
Read the paper here.
Model-based RL with Neural Network Dynamics
The sample inefficiency of deep reinforcement learning methods is one of the main bottlenecks to leveraging learning-based methods in the real world. This post investigates a sample-efficient approach for robot control. The technique learns trajectory-following locomotion skills using only minutes of data collected while the robot acts randomly in the environment.
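The general idea of model-based control from random data can be sketched on a toy problem: collect transitions with random actions, fit a dynamics model, then pick actions by simulating candidate action sequences through the learned model. Everything below is an assumption made for illustration: the 1-D environment, the one-parameter linear model (standing in for a neural network), and the random-shooting planner are not taken from the post.

```python
import random


def true_step(x, a):
    """Hypothetical 1-D environment: the action nudges the position."""
    return x + 0.5 * a


def collect_random_data(n, rng):
    """Gather (state, action, next_state) tuples by acting randomly."""
    data, x = [], 0.0
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        x_next = true_step(x, a)
        data.append((x, a, x_next))
        x = x_next
    return data


def fit_dynamics(data):
    """Least-squares fit of x_next - x = k * a (stand-in for a neural net)."""
    numerator = sum(a * (x_next - x) for x, a, x_next in data)
    denominator = sum(a * a for _, a, _ in data)
    return numerator / denominator


def plan_action(x, target, k, rng, horizon=3, n_candidates=300):
    """Random shooting: simulate random action sequences, keep the best start."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        predicted, cost = x, 0.0
        for a in actions:
            predicted += k * a  # roll the learned model forward
            cost += (predicted - target) ** 2
        if cost < best_cost:
            best_cost, best_first = cost, actions[0]
    return best_first


def run_controller(target=2.0, steps=40, seed=0):
    """Fit the model on random data, then replan at every step."""
    rng = random.Random(seed)
    k = fit_dynamics(collect_random_data(200, rng))
    x = 0.0
    for _ in range(steps):
        x = true_step(x, plan_action(x, target, k, rng))
    return k, x


if __name__ == "__main__":
    learned_k, final_x = run_controller()
    print(f"learned gain: {learned_k:.3f}, final position: {final_x:.3f}")
```

Replanning at every step and executing only the first action of the best sequence lets the controller correct for model error as it goes.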
Code, Projects & Data
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs (Stunning Video)
Watch the video. This new method synthesizes high-resolution, photo-realistic images from semantic label maps using conditional GANs. It produces visually appealing 2048x1024 results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures.
StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
DeepMind's pycolab Game Engine
pycolab is a highly-customisable gridworld game engine with some batteries included. It allows you to make your own gridworld games to test reinforcement learning agents. It was used for the recent AI Safety Gridworlds paper.
Deep Image Prior
No pre-trained network or image database needed. A randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as image denoising, superresolution, and inpainting.
Paper and code are available.
Mozilla’s Speech Recognition Model and Voice Dataset
Mozilla’s research team announced the initial release of its open source speech recognition model, as well as the world’s second-largest publicly available voice dataset, which was contributed to by nearly 20,000 people globally.
Highlighted Research Papers
[1711.09534] Neural Text Generation: A Practical Guide
In text generation models the decoder can behave in undesired ways, such as by generating truncated or repetitive outputs, outputting bland and generic responses, or in some cases producing ungrammatical gibberish. This paper is intended as a practical guide for resolving such undesired behavior in text generation models, with the aim of helping enable real-world applications.
[1711.09020] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
StarGAN is a novel and scalable approach that can perform image-to-image translations for multiple domains using a single model. The unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network.
[1711.09784] Distilling a Neural Network Into a Soft Decision Tree
It is hard to explain why a learned network makes a particular classification decision on a particular test case. This is due to the network's reliance on distributed hierarchical representations. The authors describe a way of using a trained neural net to create a type of soft decision tree that generalizes better than one learned directly from the training data.
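To make "soft decision tree" concrete, here is a minimal sketch of one common formulation: each inner node applies a sigmoid gate to the input to decide how much probability flows right versus left, and each leaf holds a fixed class distribution. The function names, the breadth-first node layout, and the choice to mix leaf distributions by path probability are assumptions of this sketch; the paper's exact parameterization and its distillation training procedure are not reproduced.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_tree_predict(x, inner_nodes, leaves):
    """Class distribution from a complete binary soft decision tree.

    inner_nodes: list of (weights, bias) in breadth-first order.
    leaves: list of class distributions, left to right.
    """
    n_inner = len(inner_nodes)
    # reach[i] is the probability of arriving at node i; the root is node 0.
    reach = [0.0] * (n_inner + len(leaves))
    reach[0] = 1.0
    for i, (weights, bias) in enumerate(inner_nodes):
        go_right = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
        reach[2 * i + 1] += reach[i] * (1.0 - go_right)  # left child
        reach[2 * i + 2] += reach[i] * go_right          # right child
    leaf_reach = reach[n_inner:]
    n_classes = len(leaves[0])
    # Mix the leaf distributions, weighted by how likely each leaf is reached.
    return [
        sum(p * leaf[c] for p, leaf in zip(leaf_reach, leaves))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Depth-2 tree: 3 inner nodes, 4 leaves, 2 classes (all values made up).
    inner = [([2.0, 0.0], 0.0), ([0.0, 2.0], 0.0), ([0.0, -2.0], 0.0)]
    leaf_dists = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
    print(soft_tree_predict([1.0, -1.0], inner, leaf_dists))
```

Because every routing decision is a single learned filter on the input, the path of high-probability gates for a given example can be read off directly, which is where the interpretability argument comes from.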
[1711.10337] Are GANs Created Equal? A Large-Scale Study
Despite a very rich research activity leading to numerous interesting GAN algorithms, it is still very hard to assess which algorithm(s) perform better than others. The authors conduct a large-scale empirical study on state-of-the-art models and evaluation measures and find that most models can reach similar scores with enough hyperparameter optimization and random restarts.