AI Paper English F.o.R.

A blog for reading artificial intelligence (AI) papers in English by applying the Frame of Reference (F.o.R.) from the book 英語リーディング教本.

Fast R-CNN | Abstract, Sentence 2

Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks. Ross Girshick, "Fast R-CNN" https://arxiv.org/abs/1504.08083 In the object detection task, Fast R-CNN achieves a speedup over R-CNN by reusing feature maps…
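The feature-map reuse mentioned above can be illustrated with a minimal RoI max-pooling sketch. This is my own toy version, not the paper's layer: `roi_max_pool`, the array names, and the integer-coordinate RoI are all illustrative assumptions (Fast R-CNN's real layer operates on strided conv features with spatial-scale mapping).

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=2):
    """Crop one proposal's window from a shared conv feature map and
    max-pool it to a fixed out_size x out_size grid, so every proposal
    reuses the same map instead of re-running the CNN per region."""
    y0, x0, y1, x1 = roi                       # window in feature-map coords
    window = feature_map[y0:y1, x0:x1]
    h, w = window.shape
    ys = np.linspace(0, h, out_size + 1).astype(int)   # row bin edges
    xs = np.linspace(0, w, out_size + 1).astype(int)   # column bin edges
    pooled = np.empty((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            pooled[i, j] = window[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return pooled

fm = np.arange(36).reshape(6, 6)          # stand-in for a conv feature map
pooled = roi_max_pool(fm, (0, 0, 4, 4))   # one proposal, fixed-size output
```

Whatever the proposal's size, the output is always `out_size x out_size`, which is what lets a fixed fully connected head classify every proposal from one shared forward pass.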

Fast R-CNN | Abstract, Sentence 1

This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Ross Girshick, "Fast R-CNN" https://arxiv.org/abs/1504.08083 In the object detection task, it succeeds in running faster than R-CNN by reusing feature maps…

R-CNN | Abstract, Sentence 8

Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation" https://arxiv.org/abs/1311.2524 For object detection…

R-CNN | Abstract, Sentence 7

We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation" https://arxiv.org/abs/1311.2…

R-CNN | Abstract, Sentence 6

We also compare R-CNN to OverFeat, a recently proposed sliding-window detector based on a similar CNN architecture. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation" https://arxiv.org…

R-CNN | Abstract, Sentence 5

Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation" https://arxiv.org/abs/1311.2524 …

R-CNN | Abstract, Sentence 4

Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pr…

R-CNN | Abstract, Sentence 3

In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012—achieving a mAP of 53.3%. Ross Girshick, et al., "Rich feature…

R-CNN | Abstract, Sentence 2

The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmenta…

R-CNN | Abstract, Sentence 1

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. Ross Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation" https://arxiv.org…

Transformer | Title "Attention is all you need."

Attention is all you need. Ashish Vaswani, et al., "Attention Is All You Need" https://arxiv.org/abs/1706.03762 About the title of "Attention Is All You Need," the paper on the Transformer, a machine translation model that uses only Attention without RNNs or CNNs…

Haar-like | Abstract, Sentence 8

Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection. Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features" ht…

Haar-like | Abstract, Sentence 7

In the domain of face detection the system yields detection rates comparable to the best previous systems. Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features" https://www.cs.cmu.edu/~efros/cour…

Haar-like | Abstract, Sentence 6

The cascade can be viewed as an object specific focus-of-attention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. Paul Viola and Michael Jon…

Haar-like | Abstract, Sentence 5

The third contribution is a method for combining increasingly more complex classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. Pa…
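The cascade idea in this sentence can be sketched in a few lines. This is a toy illustration, not the paper's detector: the function names, the stages, and their thresholds are all my own made-up stand-ins for the real boosted classifiers.

```python
import statistics

def make_stage(score_fn, threshold):
    """One cascade stage: keep the window iff its score clears the threshold."""
    return lambda region: score_fn(region) >= threshold

def cascade_detect(region, stages):
    """Run increasingly complex stages in order; reject as soon as one
    fails, so most background windows exit after the cheapest tests."""
    for stage in stages:
        if not stage(region):
            return False    # discarded early: cheap rejection of background
    return True             # survived every stage: promising, object-like region

# toy stages: a cheap brightness test first, then a costlier variance test
stages = [
    make_stage(lambda r: statistics.mean(r), 0.3),
    make_stage(lambda r: statistics.pvariance(r), 0.005),
]
```

The ordering is the point: because almost all windows in an image are background, putting the cheapest stage first means expensive computation is spent only on the few regions that pass it.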

Haar-like | Abstract, Sentence 4

The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers. Paul Viola and Michael Jones, "Rapid Object Detection using a Boo…
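As a rough illustration of the AdaBoost-based feature selection described above, here is a generic AdaBoost-with-decision-stumps sketch. It is not the paper's variant (which differs in details such as how features are evaluated); every name in it is mine.

```python
import numpy as np

def stump_predict(j, t, sign, X):
    """Decision stump: predict +1 iff sign * (feature_j - threshold) >= 0."""
    return np.where(sign * (X[:, j] - t) >= 0, 1, -1)

def train_adaboost(X, y, n_rounds=2):
    """Each round: pick the stump with the lowest *weighted* error, then
    upweight the samples it got wrong so the next stump must focus on them.
    Selecting one stump per round is what picks out a small set of
    critical features from the larger pool."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                     # per-sample weights
    stumps = []
    for _ in range(n_rounds):
        best = min(
            ((j, t, s) for j in range(d)
                       for t in np.unique(X[:, j])
                       for s in (1, -1)),
            key=lambda p: w[stump_predict(*p, X) != y].sum(),
        )
        err = max(w[stump_predict(*best, X) != y].sum(), 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # stump's vote weight
        w = w * np.exp(-alpha * y * stump_predict(*best, X))
        w /= w.sum()                            # renormalize the distribution
        stumps.append((alpha, *best))
    return stumps

def predict(stumps, X):
    """Final classifier: weighted vote of the selected stumps."""
    score = sum(a * stump_predict(j, t, s, X) for a, j, t, s in stumps)
    return np.where(score >= 0, 1, -1)

# toy data: a single feature, threshold near 2 separates the classes
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
stumps = train_adaboost(X, y)
```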

Haar-like | Abstract, Sentence 3

The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Casca…
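The Integral Image trick can be sketched in NumPy. The function names `integral_image` and `box_sum` are mine, not the paper's; the point is that any rectangle sum costs at most four table lookups, independent of rectangle size.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, y0, x0, y1, x1):
    """Sum over img[y0:y1+1, x0:x1+1] from at most four lookups into the
    integral image, which is what makes Haar-like rectangle features
    cheap to evaluate at every position and scale."""
    total = ii[y1, x1]
    if y0 > 0:
        total -= ii[y0 - 1, x1]        # strip above the box
    if x0 > 0:
        total -= ii[y1, x0 - 1]        # strip left of the box
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]    # corner subtracted twice; add back
    return total

img = np.arange(16, dtype=np.int64).reshape(4, 4)
ii = integral_image(img)
```

A Haar-like feature is then just a difference of two or three such box sums, so each feature evaluation stays constant-time after the one-off cumulative-sum pass.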

Haar-like | Abstract, Sentence 2

This work is distinguished by three key contributions. Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features" https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf Deep learning…

Haar-like | Abstract, Sentence 1

This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted …

SIFT | Abstract, Sentence 6

This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance. David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints" https://www.cs.ubc.ca/~lowe/paper…

SIFT | Abstract, Sentence 5

The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally per…
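The first step of that pipeline, matching each feature against a database, can be sketched with a brute-force nearest-neighbor search plus Lowe's distance-ratio test. This is a stand-in for the paper's fast approximate search (and it omits the Hough and pose-verification stages); the names and toy descriptors are mine.

```python
import numpy as np

def match_features(desc_query, desc_db, ratio=0.8):
    """For each query descriptor, find its two nearest database
    descriptors and keep the match only when the closest is clearly
    closer than the runner-up (the ratio test), which rejects
    ambiguous matches in cluttered scenes."""
    matches = []
    for i, d in enumerate(desc_query):
        dists = np.linalg.norm(desc_db - d, axis=1)  # brute-force distances
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < ratio * dists[second]:   # distinctive enough?
            matches.append((i, nearest))
    return matches

# toy 2-D "descriptors": real SIFT descriptors are 128-D
db = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 0.0]])
query = np.array([[0.1, 0.0],    # close to db[0], far from the rest: kept
                  [2.5, 0.0]])   # equidistant from db[0] and db[2]: rejected
matches = match_features(query, db)
```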

SIFT | Abstract, Sentence 4

This paper also describes an approach to using these features for object recognition. David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints" https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf Not deep learning but a 2004…

SIFT | Abstract, Sentence 3

The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. David G. Lowe, "Distinctive Image Features from Scale-Invariant K…

SIFT | Abstract, Sentence 2

The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. David G. Lowe, "Dist…

SIFT | Abstract, Sentence 1

This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. David G. Lowe, "Distinctive Image Features from Scale-Invar…

HOG | Abstract, Sentence 4

The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds. Navneet D…

HOG | Abstract, Sentence 3

We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descrip…

HOG | Abstract, Sentence 2

After reviewing existing edge and gradient based descriptors, we show experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. Navneet Dalal and Bill…
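The core of a HOG descriptor, a gradient-orientation histogram for one cell, can be sketched in NumPy. This covers a single cell only; the full descriptor adds overlapping block normalization, and the function name and bin count here are my own illustrative choices.

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """One HOG cell: finite-difference gradients, unsigned orientation
    binned over [0, 180) degrees, each pixel voting with its gradient
    magnitude; the histogram is then L2-normalized."""
    gy, gx = np.gradient(patch.astype(float))        # row (y) and column (x) gradients
    mag = np.hypot(gx, gy)                           # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0     # unsigned orientation
    bin_idx = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bin_idx.ravel(), mag.ravel())    # magnitude-weighted votes
    return hist / (np.linalg.norm(hist) + 1e-6)      # L2 normalization

# a vertical edge: all gradient energy is horizontal, so it lands in bin 0
patch = np.tile(np.array([0, 0, 255, 255]), (4, 1))
hist = hog_cell_histogram(patch)
```

Tiling such cell histograms over a detection window, then contrast-normalizing them in overlapping blocks, yields the feature vector that the paper feeds to a linear SVM.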

HOG | Abstract, Sentence 1

We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. Navneet Dalal and Bill Triggs, "Histograms of Oriented Gradients for Human Detection" https://lear.inrial…

ZFNet | Abstract, Sentence 7

We show our ImageNet model generalizes well to other datasets: when the softmax classifier is retrained, it convincingly beats the current state-of-the-art results on Caltech-101 and Caltech-256 datasets. Matthew D Zeiler, Rob Fergus, "Vis…