AI Paper English F.o.R.

A blog for reading papers on artificial intelligence (AI) by making full use of the Frame of Reference (F.o.R.) from 英語リーディング教本 (the English Reading Textbook).

List of posts from the one-month period starting 2019-08-01

LeNet | Abstract, Paragraph 1, Sentence 3

This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Yann LeCun, et al., "Gradient-Based Learning Applied to Document Recognition" http://yann.l…

LeNet | Abstract, Paragraph 1, Sentence 2

Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. Yan…

LeNet | Abstract, Paragraph 1, Sentence 1

Multilayer Neural Networks trained with the backpropagation algorithm constitute the best example of a successful Gradient-Based Learning technique. Yann LeCun, et al., "Gradient-Based Learning Applied to Document Recognition" http://yann.…

Deep Learning | Abstract, Sentence 4

Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech. Yann LeCun, Yoshua Bengio & Geoffrey Hinton, "Deep …

Deep Learning | Abstract, Sentence 3

Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the repres…
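To make the mechanism in this sentence concrete, here is a minimal NumPy sketch (my own illustration, not code from the paper) of one backpropagation step for a toy two-layer network: the gradient computed from the loss tells each layer how to adjust the parameters it uses to compute its representation from the layer below.

```python
# Minimal sketch (not from the paper): one backpropagation step for a
# two-layer network, showing how the gradient tells each layer how to
# change the parameters that compute its representation.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # batch of 4 inputs, 8 features (toy data)
y = rng.normal(size=(4, 1))          # regression targets (toy data)

W1 = rng.normal(scale=0.1, size=(8, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))
lr = 0.1

# Forward pass: each layer computes its representation from the previous one.
h = np.maximum(0, x @ W1)            # hidden representation (ReLU)
pred = h @ W2
loss = ((pred - y) ** 2).mean()

# Backward pass: propagate the error signal layer by layer.
grad_pred = 2 * (pred - y) / len(y)
grad_W2 = h.T @ grad_pred
grad_h = grad_pred @ W2.T
grad_W1 = x.T @ (grad_h * (h > 0))   # ReLU gate

# Parameter update: move each weight against its gradient.
W1 -= lr * grad_W1
W2 -= lr * grad_W2
```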

Deep Learning | Abstract, Sentence 2

These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Yann LeCun, Yoshua Bengio & Geoffrey Hinton, "Deep…

Deep Learning | Abstract, Sentence 1

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. Yann LeCun, Yoshua Bengio & Geoffrey Hinton, "Deep Learning" https://www.nature…

Adversarial Examples | Abstract, Sentence 6

Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset. Ian J. Goodfellow, "Explaining and Harnessing Adversarial Examples" https://arxiv.org/abs/1412.6572 …

Adversarial Examples | Abstract, Sentence 5

Moreover, this view yields a simple and fast method of generating adversarial examples. Ian J. Goodfellow, "Explaining and Harnessing Adversarial Examples" https://arxiv.org/abs/1412.6572 Adversarial examples, inputs crafted to fool a neural network,…
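The "simple and fast method" the abstract refers to is the fast gradient sign method introduced in the paper. The following is a minimal PyTorch sketch of that idea, not the authors' code; `model`, `x`, and `y` are assumed to be a classifier, an input batch, and its labels.

```python
# Sketch of the fast gradient sign method: x' = x + eps * sign(grad_x loss).
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.1):
    """Return x perturbed in the direction that increases the loss most,
    under an L-infinity budget of eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```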

Adversarial Examples | Abstract, Sentence 4

This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Ian J. Goodfellow, "Explaining and Harnessing…

Adversarial Examples | Abstract, Sentence 3

We argue instead that the primary cause of neural networks’ vulnerability to adversarial perturbation is their linear nature. Ian J. Goodfellow, "Explaining and Harnessing Adversarial Examples" https://arxiv.org/abs/1412.6572 Neural net…
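As a rough numerical illustration of the linearity argument (my own toy example, not from the paper): for a linear score w·x, adding a perturbation of size ε per coordinate in the direction sign(w) shifts the score by ε times the L1 norm of w, which grows with the input dimension even though each individual coordinate barely changes.

```python
# Toy illustration of the linearity argument: a per-coordinate perturbation
# of size eps, aligned with sign(w), changes a linear score by eps * ||w||_1,
# which grows with the input dimension.
import numpy as np

rng = np.random.default_rng(0)
eps = 0.01
for dim in (10, 1_000, 100_000):
    w = rng.normal(size=dim)
    x = rng.normal(size=dim)
    x_adv = x + eps * np.sign(w)
    print(dim, w @ x_adv - w @ x)    # equals eps * sum(|w|), grows with dim
```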

Adversarial Examples | Abstract, Sentence 2

Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. Ian J. Goodfellow, "Explaining and Harnessing Adversarial Examples" https://arxiv.org/abs/1412.6572 Adversarial examples, inputs crafted to fool a neural network,…

Adversarial Examples | Abstract, Sentence 1

Several machine learning models, including neural networks, consistently misclassify adversarial examples—inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed inpu…

AMSGrad | Abstract, Sentence 5

Our analysis suggests that the convergence issues can be fixed by endowing such algorithms with “long-term memory” of past gradients, and propose new variants of the ADAM algorithm which not only fix the convergence issues but often also l…
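A minimal sketch of how the proposed "long-term memory" variant (AMSGrad) differs from Adam: the second-moment estimate is replaced by its running maximum, so the scaling term can never shrink and a large past gradient is never forgotten. This is an illustrative single-parameter implementation, not the authors' code; `grads` is an assumed sequence of gradients.

```python
# Minimal sketch of the AMSGrad-style fix: keep a running maximum of the
# second-moment estimate so a large past gradient is never "forgotten".
import numpy as np

def amsgrad(grads, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    theta, m, v, v_hat = 0.0, 0.0, 0.0, 0.0
    for g in grads:
        m = b1 * m + (1 - b1) * g            # first moment, as in Adam
        v = b2 * v + (1 - b2) * g * g        # exponential moving average of g^2
        v_hat = max(v_hat, v)                # long-term memory: never shrinks
        theta -= lr * m / (np.sqrt(v_hat) + eps)
    return theta
```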

AMSGrad | Abstract, Sentence 4

We provide an explicit example of a simple convex optimization setting where ADAM does not converge to the optimal solution, and describe the precise problems with the previous analysis of ADAM algorithm. Sashank J. Reddi, et al., "On the …

AMSGrad | Abstract, Sentence 3

We show that one cause for such failures is the exponential moving average used in the algorithms. Sashank J. Reddi, et al., "On the Convergence of Adam and Beyond" https://arxiv.org/abs/1904.09237 A long-term memory that keeps useful gradients from being forgotten…
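A small numeric illustration of why the exponential moving average can be a problem (my own toy example, not from the paper): the contribution of one large, informative gradient decays geometrically, so after many small gradients the denominator based on √v falls back and the effective step size grows again.

```python
# Toy illustration: the exponential moving average of squared gradients
# quickly forgets one large gradient followed by many small ones, so the
# scaled step size (proportional to 1/sqrt(v)) grows back.
b2 = 0.999
v = 0.0
grads = [10.0] + [0.1] * 5000        # one informative spike, then small noise
for t, g in enumerate(grads, 1):
    v = b2 * v + (1 - b2) * g * g
    if t in (1, 500, 5000):
        print(t, v ** 0.5)            # the memory of the spike fades away
```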

AMSGrad | Abstract, Sentence 2

In many applications, e.g. learning with large output spaces, it has been empirically observed that these algorithms fail to converge to an optimal solution (or a critical point in nonconvex settings). Sashank J. Reddi, et al., "On the Con…

AMSGrad | Abstract, Sentence 1

Several recently proposed stochastic optimization methods that have been successfully used in training deep networks such as RMSPROP, ADAM, ADADELTA, NADAM are based on using gradient updates scaled by square roots of exponential moving av…

Transformer | Abstract, Sentence 7

We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1…

Transformer | Abstract, Sentence 6

On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the …

Transformer | Abstract, Sentence 5

Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1706.…

Transformer | Abstract, Sentence 4

Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/…

Transformer | Abstract, Sentence 3

We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1706.03762…
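The building block the Transformer relies on instead of recurrence and convolutions is scaled dot-product attention, softmax(QKᵀ/√d_k)V. Below is a minimal single-head NumPy sketch without masking or multi-head projections, for illustration only.

```python
# Minimal single-head sketch of scaled dot-product attention:
# softmax(Q K^T / sqrt(d_k)) V, no masking, no multi-head projections.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # similarity of queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the keys
    return weights @ V                                   # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 64))   # 5 query positions, d_k = 64
K = rng.normal(size=(7, 64))   # 7 key positions
V = rng.normal(size=(7, 64))
print(scaled_dot_product_attention(Q, K, V).shape)       # (5, 64)
```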

Transformer | Abstract, Sentence 2

The best performing models also connect the encoder and decoder through an attention mechanism. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1706.03762 A machine translation model that uses only Attention, without RNNs or CNNs,…

Transformer | Abstract, Sentence 1

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1706.03762 RNNs and CNN…