## SSD: Single Shot MultiBox Detector

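• Each anchor box is centered on the center of a feature-map pixel.
• For aspect ratio $r = 1$ and size $s_i$, the generated anchor box is $s_i \times s_i$ (width × height, as fractions of a square input).
• For aspect ratio $r_j > 0$ with $r_j \neq 1$, the generated anchor box is $s_0 \sqrt{r_j} \times s_0 / \sqrt{r_j}$, where $s_0$ is the first preset size.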
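To make the anchor rules concrete, here is a minimal pure-Python sketch that builds the anchor boxes for a single feature-map pixel. It assumes that every size is paired with the first ratio and that the first size is paired with each remaining ratio, which gives `len(sizes) + len(ratios) - 1` boxes per pixel; `anchors_at` is a hypothetical helper, not part of MXNet.

```python
import math

def anchors_at(row, col, fmap_size, sizes, ratios):
    """Anchor boxes (x_min, y_min, x_max, y_max) centred on one pixel of a
    square feature map, in coordinates normalized to [0, 1]."""
    cx = (col + 0.5) / fmap_size
    cy = (row + 0.5) / fmap_size
    # Assumed pairing: all sizes with the first ratio,
    # then the first size with the remaining ratios
    shapes = [(s, ratios[0]) for s in sizes] + [(sizes[0], r) for r in ratios[1:]]
    result = []
    for s, r in shapes:
        half_w = s * math.sqrt(r) / 2
        half_h = s / math.sqrt(r) / 2
        result.append((cx - half_w, cy - half_h, cx + half_w, cy + half_h))
    return result

pixel_anchors = anchors_at(20, 20, 40, sizes=[.5, .25, .1], ratios=[1, 2, .5])
print(len(pixel_anchors))   # 5 boxes per pixel
print(pixel_anchors[0])     # approximately (0.2625, 0.2625, 0.7625, 0.7625)
```

The first box agrees with the value that the `MultiBoxPrior` example in this section prints for the pixel at row 21, column 21.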

```python
import mxnet as mx
from mxnet import nd
from mxnet.contrib.ndarray import MultiBoxPrior

n = 40
# Input shape: batch x channel x height x width
x = nd.random_uniform(shape=(1, 3, n, n))

y = MultiBoxPrior(x, sizes=[.5, .25, .1], ratios=[1, 2, .5])

# Take the first anchor box at pixel (20, 20)
# Format: (x_min, y_min, x_max, y_max)
boxes = y.reshape((n, n, -1, 4))
print('The first anchor box at row 21, column 21:', boxes[20, 20, 0, :])
```

The first anchor box at row 21, column 21:
[ 0.26249999 0.26249999 0.76249999 0.76249999]
<NDArray 4 @cpu(0)>

```python
import matplotlib.pyplot as plt

def box_to_rect(box, color, linewidth=3):
    """Convert an anchor box to a matplotlib rectangle."""
    box = box.asnumpy()
    return plt.Rectangle(
        (box[0], box[1]), (box[2] - box[0]), (box[3] - box[1]),
        fill=False, edgecolor=color, linewidth=linewidth)

colors = ['blue', 'green', 'red', 'black', 'magenta']
plt.imshow(nd.ones((n, n, 3)).asnumpy())
anchors = boxes[20, 20, :, :]
for i in range(anchors.shape[0]):
    # Assumed loop body (missing in the source): scale the normalized
    # box to pixel units and draw it
    plt.gca().add_patch(box_to_rect(anchors[i, :] * n, colors[i]))
plt.show()
```

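• For the $i$-th anchor box at a pixel, channel $i \times (\text{num\_classes} + 1)$ holds the background (non-object) score
• Channel $i \times (\text{num\_classes} + 1) + 1 + j$ holds the score of class $j$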

```python
from mxnet.gluon import nn

def class_predictor(num_anchors, num_classes):
    """Return a layer to predict classes."""
    return nn.Conv2D(num_anchors * (num_classes + 1), 3, padding=1)

cls_pred = class_predictor(5, 10)
cls_pred.initialize()
x = nd.zeros((2, 3, 20, 20))
print('Class prediction', cls_pred(x).shape)
```

Class prediction (2, 55, 20, 20)
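The 55 output channels can be indexed by hand. The sketch below assumes the channels are laid out anchor by anchor, with the background score first and the class scores after it; both helper functions are hypothetical names used only for illustration.

```python
def background_channel(anchor_idx, num_classes):
    """Channel holding the background score of one anchor box."""
    return anchor_idx * (num_classes + 1)

def class_channel(anchor_idx, class_idx, num_classes):
    """Channel holding the score of class `class_idx` for one anchor box."""
    return anchor_idx * (num_classes + 1) + 1 + class_idx

num_anchors, num_classes = 5, 10
total_channels = num_anchors * (num_classes + 1)
print(total_channels)                        # 55
print(background_channel(2, num_classes))    # 22
print(class_channel(2, 3, num_classes))      # 26
```

With 5 anchors and 10 classes the last channel, index 54, is class 9 of anchor 4.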

```python
def box_predictor(num_anchors):
    """Return a layer to predict delta locations."""
    return nn.Conv2D(num_anchors * 4, 3, padding=1)

box_pred = box_predictor(10)
box_pred.initialize()
x = nd.zeros((2, 3, 20, 20))
print('Box prediction', box_pred(x).shape)
```

Box prediction (2, 40, 20, 20)

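```python
def down_sample(num_filters):
    """Stack two Conv-BatchNorm-ReLU blocks and then a pooling layer
    to halve the feature size."""
    out = nn.HybridSequential()
    for _ in range(2):
        # Assumed loop body (missing in the source), matching the layers
        # printed for the full model later in this section
        out.add(nn.Conv2D(num_filters, 3, strides=1, padding=1))
        out.add(nn.BatchNorm(in_channels=num_filters))
        out.add(nn.Activation('relu'))
    out.add(nn.MaxPool2D(2))
    return out

blk = down_sample(10)
blk.initialize()
x = nd.zeros((2, 3, 20, 20))
print('Before', x.shape, 'after', blk(x).shape)
```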

Before (2, 3, 20, 20) after (2, 10, 10, 10)
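The halving from 20 to 10 follows from the usual output-size formula for convolution and pooling layers. A small sketch, assuming 3x3 convolutions with padding 1 and a 2x2 max pooling with stride 2 as described in the docstring (`conv_out` is a hypothetical helper):

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 20
for _ in range(2):
    size = conv_out(size, kernel=3, padding=1)   # 3x3 conv, padding 1: size unchanged
size = conv_out(size, kernel=2, stride=2)        # 2x2 pooling, stride 2: size halved
print(size)  # 10
```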

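A key idea of SSD is that it uses feature maps at multiple scales to predict objects of different sizes. Shallow feature maps have a larger spatial size, and the size shrinks as we go deeper into the network; we usually keep downsampling until the map is $1 \times 1$, which is used to predict objects that fill the whole image. The anchor boxes, class probabilities, and box offsets produced by every feature map therefore need to be merged so that they can be compared against the ground-truth objects over the whole image. To keep them in one-to-one correspondence, we flatten all anchor boxes, class probabilities, and box offsets and then concatenate them. The result is a flattened but consistently ordered set of predictions and anchor boxes.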

```python
# Create an arbitrary 20x20 feature map
feat1 = nd.zeros((2, 8, 20, 20))
print('Feature map 1', feat1.shape)
cls_pred1 = class_predictor(5, 10)
cls_pred1.initialize()
y1 = cls_pred1(feat1)
print('Class prediction for feature map 1', y1.shape)
# Downsample
ds = down_sample(16)
ds.initialize()
feat2 = ds(feat1)
print('Feature map 2', feat2.shape)
cls_pred2 = class_predictor(3, 10)
cls_pred2.initialize()
y2 = cls_pred2(feat2)
print('Class prediction for feature map 2', y2.shape)
```

Feature map 1 (2, 8, 20, 20)
Class prediction for feature map 1 (2, 55, 20, 20)
Feature map 2 (2, 16, 10, 10)
Class prediction for feature map 2 (2, 33, 10, 10)

```python
def flatten_prediction(pred):
    return nd.flatten(nd.transpose(pred, axes=(0, 2, 3, 1)))

def concat_predictions(preds):
    return nd.concat(*preds, dim=1)

flat_y1 = flatten_prediction(y1)
print('Flatten class prediction 1', flat_y1.shape)
flat_y2 = flatten_prediction(y2)
print('Flatten class prediction 2', flat_y2.shape)
print('Concat class predictions', concat_predictions([flat_y1, flat_y2]).shape)
```

Flatten class prediction 1 (2, 22000)
Flatten class prediction 2 (2, 3300)
Concat class predictions (2, 25300)
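The flattened lengths are simply channels × height × width per sample. The sketch below also shows why the channel axis is moved last before flattening: all channels of one pixel then sit next to each other in the flat vector. `flat_len` and `flat_index` are hypothetical helpers for illustration.

```python
def flat_len(channels, height, width):
    """Length of one flattened prediction per sample."""
    return channels * height * width

def flat_index(row, col, channel, width, channels):
    """Position of one value after transposing to (height, width, channel)
    and flattening."""
    return (row * width + col) * channels + channel

len1 = flat_len(55, 20, 20)
len2 = flat_len(33, 10, 10)
print(len1, len2, len1 + len2)          # 22000 3300 25300

# The last channel of pixel (0, 0) is directly followed by
# the first channel of pixel (0, 1)
print(flat_index(0, 0, 54, 20, 55))     # 54
print(flat_index(0, 1, 0, 20, 55))      # 55
```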

```python
from mxnet import gluon

def body():
    """Return the body network."""
    out = nn.HybridSequential()
    for nfilters in [16, 32, 64]:
        # Assumed loop body (missing in the source): one down-sampling
        # block per filter count
        out.add(down_sample(nfilters))
    return out

bnet = body()
bnet.initialize()
x = nd.zeros((2, 3, 256, 256))
print('Body network', [y.shape for y in bnet(x)])
```

Body network [(64, 32, 32), (64, 32, 32)]

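```python
def toy_ssd_model(num_anchors, num_classes):
    """Return SSD modules."""
    downsamples = nn.Sequential()
    class_preds = nn.Sequential()
    box_preds = nn.Sequential()

    # The loop bodies are missing in the source; the ones below are assumed
    # and reproduce the structure printed after this block
    for _ in range(3):
        downsamples.add(down_sample(128))

    for scale in range(5):
        class_preds.add(class_predictor(num_anchors, num_classes))
        box_preds.add(box_predictor(num_anchors))

    return body(), downsamples, class_preds, box_preds

print(toy_ssd_model(5, 2))
```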

(HybridSequential(
(0): HybridSequential(
(0): Conv2D(16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=16)
(2): Activation(relu)
(3): Conv2D(16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=16)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
(1): HybridSequential(
(0): Conv2D(32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=32)
(2): Activation(relu)
(3): Conv2D(32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=32)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
(2): HybridSequential(
(0): Conv2D(64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=64)
(2): Activation(relu)
(3): Conv2D(64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=64)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
), Sequential(
(0): HybridSequential(
(0): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(2): Activation(relu)
(3): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
(1): HybridSequential(
(0): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(2): Activation(relu)
(3): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
(2): HybridSequential(
(0): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(2): Activation(relu)
(3): Conv2D(128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm(fix_gamma=False, axis=1, momentum=0.9, eps=1e-05, in_channels=128)
(5): Activation(relu)
(6): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
)
), Sequential(
(0): Conv2D(15, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): Conv2D(15, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(2): Conv2D(15, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): Conv2D(15, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): Conv2D(15, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
), Sequential(
(0): Conv2D(20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): Conv2D(20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(2): Conv2D(20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): Conv2D(20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): Conv2D(20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
))

```python
def toy_ssd_forward(x, body, downsamples, class_preds, box_preds, sizes, ratios):
    # Compute the output of the body network
    x = body(x)

    # At each prediction layer, compute anchor boxes, class scores and offsets,
    # then downsample to the next prediction layer and repeat
    default_anchors = []
    predicted_boxes = []
    predicted_classes = []

    for i in range(5):
        default_anchors.append(MultiBoxPrior(x, sizes=sizes[i], ratios=ratios[i]))
        predicted_boxes.append(flatten_prediction(box_preds[i](x)))
        predicted_classes.append(flatten_prediction(class_preds[i](x)))
        if i < 3:
            x = downsamples[i](x)
        elif i == 3:
            # The last layer can simply use global pooling
            x = nd.Pooling(x, global_pool=True, pool_type='max', kernel=(4, 4))

    return default_anchors, predicted_classes, predicted_boxes
```

```python
from mxnet import gluon

class ToySSD(gluon.Block):
    def __init__(self, num_classes, **kwargs):
        super(ToySSD, self).__init__(**kwargs)
        # 5 prediction layers; each handles different anchor sizes, from small
        # to large, following the shape of the network
        self.anchor_sizes = [[.2, .272], [.37, .447], [.54, .619], [.71, .79], [.88, .961]]
        # Every layer uses 1, 2 and 0.5 as candidate aspect ratios
        self.anchor_ratios = [[1, 2, .5]] * 5
        self.num_classes = num_classes

        with self.name_scope():
            self.body, self.downsamples, self.class_preds, self.box_preds = toy_ssd_model(4, num_classes)

    def forward(self, x):
        default_anchors, predicted_classes, predicted_boxes = toy_ssd_forward(
            x, self.body, self.downsamples, self.class_preds, self.box_preds,
            self.anchor_sizes, self.anchor_ratios)
        # Flatten and concatenate the results of every prediction layer
        # so that they stay in one-to-one correspondence
        anchors = concat_predictions(default_anchors)
        box_preds = concat_predictions(predicted_boxes)
        class_preds = concat_predictions(predicted_classes)
        # Reshape to make the softmax easier to compute
        class_preds = nd.reshape(class_preds, shape=(0, -1, self.num_classes + 1))

        return anchors, class_preds, box_preds
```

```python
# Create an SSD network with 2 positive classes
net = ToySSD(2)
net.initialize()
x = nd.zeros((1, 3, 256, 256))
default_anchors, class_predictions, box_predictions = net(x)
print('Outputs:', 'anchors', default_anchors.shape,
      'class prediction', class_predictions.shape,
      'box prediction', box_predictions.shape)
```

Outputs: anchors (1, 5444, 4) class prediction (1, 5444, 3) box prediction (1, 21776)
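The figure 5444 can be checked by hand. The sketch below assumes the five prediction layers see feature maps of size 32, 16, 8, 4 and 1 (a 256x256 input halved three times by the body, three more times by the downsampling blocks, then globally pooled), and that each pixel gets `len(sizes) + len(ratios) - 1` anchor boxes.

```python
feature_sizes = [32, 16, 8, 4, 1]   # assumed spatial size at each prediction layer
anchors_per_pixel = 2 + 3 - 1       # two sizes and three ratios per layer

num_anchors = sum(s * s for s in feature_sizes) * anchors_per_pixel
print(num_anchors)        # 5444 anchor boxes in total
print(num_anchors * 4)    # 21776 box offsets, four per anchor box
```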

## Dataset

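```python
from mxnet.test_utils import download
import os.path as osp

def verified(file_path, sha1hash):
    import hashlib
    sha1 = hashlib.sha1()
    with open(file_path, 'rb') as f:
        while True:
            # Assumed (line missing in the source): read the file in 1 MiB chunks
            data = f.read(1048576)
            if not data:
                break
            sha1.update(data)
    matched = sha1.hexdigest() == sha1hash
    if not matched:
        # Assumed (line missing in the source): report the mismatch
        print('Hash mismatch in file {}, possibly an incomplete download.'.format(file_path))
    return matched

url_format = 'https://apache-mxnet.s3-accelerate.amazonaws.com/gluon/datasets/pikachu/{}'
hashes = {'train.rec': 'e6bcb6ffba1ac04ff8a9b1115e650af56ee969c8',
          'train.idx': 'dcf7318b2602c06428b9988470c731621716c393',
          'val.rec': 'd6c33f799b4d058e82f2cb5bd9a976f69d72d520'}
for k, v in hashes.items():
    fname = 'pikachu_' + k
    target = osp.join('data', fname)
    url = url_format.format(k)
    if not osp.exists(target) or not verified(target, v):
        # Assumed (lines missing in the source): fetch the file into ./data
        print('Downloading', target, url)
        download(url, fname=fname, dirname='data', overwrite=True)
```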

```python
import mxnet.image as image

data_shape = 256
batch_size = 32

def get_iterators(data_shape, batch_size):
    class_names = ['pikachu']
    num_class = len(class_names)
    train_iter = image.ImageDetIter(
        batch_size=batch_size,
        data_shape=(3, data_shape, data_shape),
        path_imgrec='./data/pikachu_train.rec',
        path_imgidx='./data/pikachu_train.idx',
        shuffle=True,
        mean=True,
        rand_crop=1,
        min_object_covered=0.95,
        max_attempts=200)
    val_iter = image.ImageDetIter(
        batch_size=batch_size,
        data_shape=(3, data_shape, data_shape),
        path_imgrec='./data/pikachu_val.rec',
        shuffle=False,
        mean=True)
    return train_iter, val_iter, class_names, num_class

train_data, test_data, class_names, num_class = get_iterators(data_shape, batch_size)
batch = train_data.next()
print(batch)
```

DataBatch: data shapes: [(32, 3, 256, 256)] label shapes: [(32, 1, 5)]

```python
import numpy as np

img = batch.data[0][0].asnumpy()  # first image of the first batch, as a NumPy array
img = img.transpose((1, 2, 0))  # move channels last (CHW -> HWC)
img += np.array([123, 117, 104])  # add the mean back
img = img.astype(np.uint8)  # images should be in the 0-255 range
# Draw the ground-truth boxes on the image
for label in batch.label[0][0].asnumpy():
    if label[0] < 0:
        break
    print(label)
    xmin, ymin, xmax, ymax = [int(x * data_shape) for x in label[1:5]]
    rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                         fill=False, edgecolor=(1, 0, 0), linewidth=3)
    # Assumed (line missing in the source): add the rectangle to the plot
    plt.gca().add_patch(rect)
plt.imshow(img)
plt.show()
```

[ 0. 0.75724518 0.34316057 0.93332517 0.70017999]
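Each label row reads as `[class_id, xmin, ymin, xmax, ymax]`, with the coordinates given as fractions of the image size. As a small sketch, the printed row converts to pixel coordinates for a 256x256 image as follows:

```python
label = [0., 0.75724518, 0.34316057, 0.93332517, 0.70017999]
data_shape = 256

class_id = int(label[0])
# Scale the normalized corners to pixels, truncating as in the plotting code
xmin, ymin, xmax, ymax = [int(v * data_shape) for v in label[1:5]]
print(class_id, (xmin, ymin, xmax, ymax))   # 0 (193, 87, 238, 179)
```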

## Train

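```python
from mxnet.contrib.ndarray import MultiBoxTarget

def training_targets(default_anchors, class_predicts, labels):
    class_predicts = nd.transpose(class_predicts, axes=(0, 2, 1))
    z = MultiBoxTarget(*[default_anchors, labels, class_predicts])
    box_target = z[0]  # anchor box offsets (x, y, width, height)
    box_mask = z[1]  # assumed (line missing in the source): mask selecting the offsets that count
    cls_target = z[2]  # class that each anchor box should predict
    # Assumed return (missing in the source); the order matches how the
    # training loop unpacks the result
    return box_target, box_mask, cls_target
```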

`gluon.loss` offers many predefined loss functions to choose from, and we can also quickly write our own.

```python
class FocalLoss(gluon.loss.Loss):
    def __init__(self, axis=-1, alpha=0.25, gamma=2, batch_axis=0, **kwargs):
        super(FocalLoss, self).__init__(None, batch_axis, **kwargs)
        self._axis = axis
        self._alpha = alpha
        self._gamma = gamma

    def hybrid_forward(self, F, output, label):
        output = F.softmax(output)
        pt = F.pick(output, label, axis=self._axis, keepdims=True)
        loss = -self._alpha * ((1 - pt) ** self._gamma) * F.log(pt)
        return F.mean(loss, axis=self._batch_axis, exclude=True)

# cls_loss = gluon.loss.SoftmaxCrossEntropyLoss()
cls_loss = FocalLoss()
print(cls_loss)
```

FocalLoss(batch_axis=0, w=None)
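To see what the `(1 - pt) ** gamma` factor does, the following pure-Python sketch compares plain cross entropy with the focal loss formula used above for one well-classified and one badly classified example. The probabilities 0.9 and 0.1 are made-up inputs for illustration.

```python
import math

def cross_entropy(pt):
    """Cross entropy for the probability assigned to the true class."""
    return -math.log(pt)

def focal_loss(pt, alpha=0.25, gamma=2):
    """Focal loss for the probability assigned to the true class."""
    return -alpha * (1 - pt) ** gamma * math.log(pt)

for pt in (0.9, 0.1):
    ratio = focal_loss(pt) / cross_entropy(pt)
    print('pt=%.1f  ce=%.4f  focal=%.6f  focal/ce=%.4f'
          % (pt, cross_entropy(pt), focal_loss(pt), ratio))
```

The easy example (`pt = 0.9`) is scaled down by a factor of 0.0025, the hard one (`pt = 0.1`) only by 0.2025, so hard examples dominate the loss.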

```python
class SmoothL1Loss(gluon.loss.Loss):
    def __init__(self, batch_axis=0, **kwargs):
        super(SmoothL1Loss, self).__init__(None, batch_axis, **kwargs)

    def hybrid_forward(self, F, output, label, mask):
        loss = F.smooth_l1((output - label) * mask, scalar=1.0)
        return F.mean(loss, self._batch_axis, exclude=True)

box_loss = SmoothL1Loss()
print(box_loss)
```

SmoothL1Loss(batch_axis=0, w=None)
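For reference, the smooth L1 function is quadratic near zero and linear further out. A scalar sketch in pure Python, where `sigma` plays the role of the `scalar` argument above and the sample values are made up for illustration:

```python
def smooth_l1(x, sigma=1.0):
    """Smooth L1: quadratic for small |x|, linear for large |x|."""
    if abs(x) < 1.0 / sigma ** 2:
        return 0.5 * (sigma * x) ** 2
    return abs(x) - 0.5 / sigma ** 2

print(smooth_l1(0.5))   # 0.125 (quadratic region)
print(smooth_l1(2.0))   # 1.5   (linear region)

# A zero mask removes an offset from the loss entirely
output, label, mask = 3.0, 1.0, 0.0
print(smooth_l1((output - label) * mask))   # 0.0
```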

```python
cls_metric = mx.metric.Accuracy()
box_metric = mx.metric.MAE()  # measure absolute difference between prediction and target
```

```python
ctx = mx.gpu()  # use the GPU to speed up training
try:
    _ = nd.zeros(1, ctx=ctx)
    # For efficiency the CUDA implementation needs a little padding;
    # this does not affect the results
    train_data.reshape(label_shape=(3, 5))
    train_data = test_data.sync_label_shape(train_data)
except mx.base.MXNetError as err:
    # No GPU is fine too, the CPU will just take longer
    print('No GPU enabled, fall back to CPU, sit back and be patient...')
    ctx = mx.cpu()
```

```python
net = ToySSD(num_class)
net.initialize(mx.init.Xavier(magnitude=2), ctx=ctx)
```

`gluon.Trainer` simplifies optimizing the network parameters and spares us the pain of updating each parameter by hand.

```python
net.collect_params().reset_ctx(ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1, 'wd': 5e-4})
```

Set `from_scratch = True` to train from scratch; otherwise training resumes from pretrained parameters.

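```python
epochs = 150  # set a larger value for better results
log_interval = 20
from_scratch = False  # set to True to train from scratch
if from_scratch:
    start_epoch = 0
else:
    start_epoch = 148
    pretrained = 'ssd_pretrained.params'
    sha1 = 'fbb7d872d76355fff1790d864c2238decdb452bc'
    url = 'https://apache-mxnet.s3-accelerate.amazonaws.com/gluon/models/ssd_pikachu-fbb7d872.params'
    if not osp.exists(pretrained) or not verified(pretrained, sha1):
        # Assumed (lines missing in the source): download the pretrained weights
        print('Downloading', pretrained, url)
        download(url, fname=pretrained, overwrite=True)
    # Assumed (line missing in the source): load the weights into the network
    net.load_params(pretrained, ctx)
```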

```python
import time
from mxnet import autograd as ag

for epoch in range(start_epoch, epochs):
    # Reset the iterator, the metrics and the timestamp
    train_data.reset()
    cls_metric.reset()
    box_metric.reset()
    tic = time.time()
    # Iterate over every batch
    for i, batch in enumerate(train_data):
        btic = time.time()
        with ag.record():
            x = batch.data[0].as_in_context(ctx)
            y = batch.label[0].as_in_context(ctx)
            default_anchors, class_predictions, box_predictions = net(x)
            box_target, box_mask, cls_target = training_targets(default_anchors, class_predictions, y)
            # Compute the losses
            loss1 = cls_loss(class_predictions, cls_target)
            # Assumed (line missing in the source): masked box regression loss
            loss2 = box_loss(box_predictions, box_target, box_mask)
            # Add the two losses 1:1; they could also be weighted
            loss = loss1 + loss2
            # Backpropagate
            loss.backward()
        # Update the network parameters with the trainer
        trainer.step(batch_size)
        # Update the metrics
        cls_metric.update([cls_target], [nd.transpose(class_predictions, (0, 2, 1))])
        # Assumed (line missing in the source): track the masked box error
        box_metric.update([box_target], [box_predictions * box_mask])
        if (i + 1) % log_interval == 0:
            name1, val1 = cls_metric.get()
            name2, val2 = box_metric.get()
            print('[Epoch %d Batch %d] speed: %f samples/s, training: %s=%f, %s=%f'
                  % (epoch, i, batch_size / (time.time() - btic), name1, val1, name2, val2))

    # Print the metrics for the whole epoch
    name1, val1 = cls_metric.get()
    name2, val2 = box_metric.get()
    print('[Epoch %d] training: %s=%f, %s=%f' % (epoch, name1, val1, name2, val2))
    print('[Epoch %d] time cost: %f' % (epoch, time.time() - tic))

# The network parameters can also be saved for later use
net.save_params('ssd_%d.params' % epochs)
```

[Epoch 148 Batch 19] speed: 109.217423 samples/s, training: accuracy=0.997539, mae=0.001862
[Epoch 148] training: accuracy=0.997610, mae=0.001806
[Epoch 148] time cost: 17.762958
[Epoch 149 Batch 19] speed: 110.492729 samples/s, training: accuracy=0.997607, mae=0.001824
[Epoch 149] training: accuracy=0.997692, mae=0.001789
[Epoch 149] time cost: 15.353258

## Test

```python
import numpy as np
import cv2

def preprocess(image):
    """Take an image and apply preprocessing."""
    # Resize the image to the network input size
    image = cv2.resize(image, (data_shape, data_shape))
    # Convert BGR to RGB
    image = image[:, :, (2, 1, 0)]
    # Convert to float before subtracting the mean
    image = image.astype(np.float32)
    # Subtract the mean
    image -= np.array([123, 117, 104])
    # Rearrange to [batch, channel, height, width]
    image = np.transpose(image, (2, 0, 1))
    image = image[np.newaxis, :]
    # Convert to an NDArray
    image = nd.array(image)
    return image

# Assumed (line missing in the source): load a test image with OpenCV;
# 'pikachu.jpg' is a hypothetical path
image = cv2.imread('pikachu.jpg')
x = preprocess(image)
print('x', x.shape)
```

x (1, 3, 256, 256)

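```python
# If trained network parameters are available, they can be loaded directly.
# Assumed (line missing in the source): reload the file saved after training
net.load_params('ssd_%d.params' % epochs, ctx)
anchors, cls_preds, box_preds = net(x.as_in_context(ctx))
print('anchors', anchors)
print('class predictions', cls_preds)
print('box delta predictions', box_preds)
```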

anchors
[[[-0.084375 -0.084375 0.115625 0.115625 ]
[-0.12037501 -0.12037501 0.15162501 0.15162501]
[-0.12579636 -0.05508568 0.15704636 0.08633568]
...,
[ 0.01949999 0.01949999 0.98049998 0.98049998]
[-0.12225395 0.18887302 1.12225389 0.81112695]
[ 0.18887302 -0.12225395 0.81112695 1.12225389]]]
<NDArray 1x5444x4 @gpu(0)>
class predictions
[[[ 0.33754104 -1.64660323]
[ 1.15297699 -1.77257478]
[ 1.1535604 -0.98352218]
...,
[-0.27562004 -1.29400492]
[ 0.45524898 -0.88782215]
[ 0.20327765 -0.94481993]]]
<NDArray 1x5444x2 @gpu(0)>
box delta predictions
[[-0.16735925 -0.13083346 -0.68860865 ..., -0.18972112 0.11822788
-0.27067867]]
<NDArray 1x21776 @gpu(0)>
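The class predictions printed above are raw scores, not probabilities. A minimal pure-Python softmax applied to the first printed row shows the conversion into values between 0 and 1 (a sketch for a single anchor box, not the batched MXNet operator):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# First row of the printed class predictions: [background, pikachu]
probs = softmax([0.33754104, -1.64660323])
print(probs)   # background probability of roughly 0.88
```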

```python
from mxnet.contrib.ndarray import MultiBoxDetection

# Run softmax to turn the scores into probabilities between 0 and 1
cls_probs = nd.SoftmaxActivation(nd.transpose(cls_preds, (0, 2, 1)), mode='channel')
# Add the offsets to the anchor boxes, drop low-scoring boxes,
# and run NMS to get the final result
output = MultiBoxDetection(*[cls_probs, box_preds, anchors], force_suppress=True, clip=False)
print(output)
```

[[[ 0. 0.61178613 0.51807499 0.5042429 0.67325425 0.70118797]
[-1. 0.59466797 0.52491206 0.50917625 0.66228026 0.70489514]
[-1. 0.5731774 0.53843218 0.50217044 0.66522425 0.7118448 ]
...,
[-1. -1. -1. -1. -1. -1. ]
[-1. -1. -1. -1. -1. -1. ]
[-1. -1. -1. -1. -1. -1. ]]]
<NDArray 1x5444x6 @gpu(0)>
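Each output row reads as `[class_id, score, xmin, ymin, xmax, ymax]`, and a class id of -1 marks a box that was discarded. The sketch below is a simplified greedy non-maximum suppression in pure Python, not MXNet's implementation. It ignores class ids, which is how `force_suppress=True` is understood here, and the 0.5 IoU threshold is an assumed value. As input it takes the first three printed rows, treated as same-class candidates before suppression.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(dets, iou_thresh=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps
    an already kept box by more than iou_thresh."""
    kept = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(det[2:], k[2:]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept

candidates = [
    [0, 0.61178613, 0.51807499, 0.5042429, 0.67325425, 0.70118797],
    [0, 0.59466797, 0.52491206, 0.50917625, 0.66228026, 0.70489514],
    [0, 0.5731774, 0.53843218, 0.50217044, 0.66522425, 0.7118448],
]
kept = nms(candidates)
print(len(kept), kept[0][1])   # only the top-scoring box survives
```

The two lower-scoring boxes overlap the best one heavily, which is consistent with them carrying class id -1 in the printed output.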

```python
def display(img, out, thresh=0.5):
    import random
    import matplotlib as mpl
    mpl.rcParams['figure.figsize'] = (10, 10)
    pens = dict()
    plt.clf()
    plt.imshow(img)
    for det in out:
        cid = int(det[0])
        if cid < 0:
            continue
        score = det[1]
        if score < thresh:
            continue
        if cid not in pens:
            pens[cid] = (random.random(), random.random(), random.random())
        scales = [img.shape[1], img.shape[0]] * 2
        xmin, ymin, xmax, ymax = [int(p * s) for p, s in zip(det[2:6].tolist(), scales)]
        rect = plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, fill=False,
                             edgecolor=pens[cid], linewidth=3)
        # Assumed (line missing in the source): add the rectangle to the plot
        plt.gca().add_patch(rect)
        text = class_names[cid]
        plt.gca().text(xmin, ymin - 2, '{:s} {:.3f}'.format(text, score),
                       bbox=dict(facecolor=pens[cid], alpha=0.5),
                       fontsize=12, color='white')
    plt.show()

display(image[:, :, (2, 1, 0)], output[0].asnumpy(), thresh=0.45)
```

## Related links

Apache MXNet official website: https://mxnet.incubator.apache.org/

GitHub repo: zackchase/mxnet-the-straight-dope

Eric's Zhihu post on the new features in 0.11: https://zhuanlan.zhihu.com/p/28648399

0.11 release: https://github.com/apache/incubator-mxnet/releases
