Deep Learning & Convolutional Neural Networks Study Notes (3): Applications

June 17, 2018 · Author: Manchery

Object detection

  • classification with localization: mainly 1 object, output a bounding box
  • object detection: multi-object
  • $y=\left[\begin{matrix}
    p_c \\
    b_x \\
    b_y \\
    b_h \\
    b_w \\
    c_1 \\
    c_2 \\
    c_3
    \end{matrix} \right]$
  • landmark detection

Sliding windows

  • ConvNet + sliding windows
  • Conv implementation of sliding windows
    • FC $\rightarrow$ CONV
    • conv implementation
    • the strides of successive layers have to be compatible with each other
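
The equivalence behind the convolutional implementation can be checked with a small sketch. It assumes a toy single-channel network (one 3×3 conv layer followed by one "FC" unit, no bias or activation); the function names are made up for illustration and are not from the course. The FC unit is written as a convolution whose kernel covers the whole feature map of one window, so running it over the full image yields every window's score in one pass.

```python
def conv2d_valid(image, kernel, stride=1):
    """Single-channel 'valid' cross-correlation with a square kernel."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    return [
        [
            sum(image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def crop(image, top, left, size):
    """Cut a size x size window out of the image."""
    return [row[left:left + size] for row in image[top:top + size]]


def per_window_scores(image, window, stride, conv_kernel, fc_weights):
    """Naive sliding windows: crop every window and run the network on it."""
    n_rows = (len(image) - window) // stride + 1
    n_cols = (len(image[0]) - window) // stride + 1
    scores = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            patch = crop(image, r * stride, c * stride, window)
            features = conv2d_valid(patch, conv_kernel)
            # The FC unit sees the whole feature map of this window,
            # i.e. a conv whose kernel is as large as that feature map.
            row.append(conv2d_valid(features, fc_weights)[0][0])
        scores.append(row)
    return scores


def convolutional_scores(image, stride, conv_kernel, fc_weights):
    """Conv implementation: one pass over the full image, shared features."""
    features = conv2d_valid(image, conv_kernel)
    # The conv layer has stride 1 and there is no pooling, so the window
    # stride carries over unchanged to the FC-as-conv layer.
    return conv2d_valid(features, fc_weights, stride=stride)
```

With a 4×4 window, a 3×3 conv kernel and a 2×2 FC kernel, both functions return the same grid of scores, but the second computes the conv features only once. If a pooling layer were added, the window stride would have to be a multiple of the pooling stride, which is the compatibility issue noted above.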

YOLO

  • Bounding Box Prediction: YOLO algorithm
  • evaluation: a detection counts as correct if intersection over union (IoU) $\geq 0.5$
  • non-max suppression
  • anchor-boxes: tiebreaking
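
IoU and non-max suppression can be sketched in a few lines. This is a minimal version, assuming boxes are given as corner tuples `(x1, y1, x2, y2)` and that each box carries one confidence score; the threshold default and function names are illustrative choices, not the YOLO reference code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes that are kept.

    Boxes are visited from highest to lowest score; a box is dropped
    if it overlaps an already kept box with IoU >= iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` is 1/7: the overlap has area 1 and the union has area 4 + 4 - 1 = 7. In a multi-class detector, suppression is usually run separately for each class.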

Region proposal

  • R-CNN
  • fast R-CNN
  • faster R-CNN

Face Recognition

  • face verification vs. recognition
  • one-shot learning: learning a “similarity” function, $d(img_1,img_2)\leq \tau$
  • Siamese network

Triplet

  • triplet loss: A, P, N
  • $d(A,P)-d(A,N)+\alpha\leq 0$
  • $$L(A,P,N)=\max(0,\|f(A)-f(P)\|^2-\|f(A)-f(N)\|^2+\alpha)$$
  • $J=\sum L(A^{(i)},P^{(i)},N^{(i)})$
  • choosing triplets: not at random; pick triplets that are “hard” to train on
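
The triplet loss above translates directly into code. A minimal sketch, assuming the embeddings $f(A)$, $f(P)$, $f(N)$ are already computed and passed in as plain lists of numbers; the default margin `alpha=0.2` is an arbitrary illustrative value.

```python
def squared_distance(u, v):
    """Squared Euclidean distance ||u - v||^2 between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """L(A, P, N) = max(0, ||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha)."""
    return max(0.0, squared_distance(f_a, f_p) - squared_distance(f_a, f_n) + alpha)


def triplet_cost(triplets, alpha=0.2):
    """J = sum of L over all (anchor, positive, negative) embedding triplets."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triplets)
```

An "easy" triplet, where the negative is already much farther from the anchor than the positive, gives a loss of 0 and contributes no gradient, which is why randomly chosen triplets train poorly.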

Binary Classification

  • $\hat{y}=\sigma\left(\sum_k w_k | f(x^{(i)})_k - f(x^{(j)})_k|+b\right)$
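
As a rough sketch of this verification head: the code below assumes a sigmoid is applied on top of the weighted sum (the usual choice for a binary classifier) and that there is a single scalar bias `b`. The embeddings, weights and function name are hypothetical inputs for illustration.

```python
import math


def same_person_probability(f_i, f_j, w, b):
    """Sigmoid of a weighted sum of element-wise absolute embedding differences."""
    z = sum(w_k * abs(a - c) for w_k, a, c in zip(w, f_i, f_j)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

With negative weights, larger differences between the two embeddings push the output toward 0 ("different person").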

Neural style transfer

  • Content + Style $\rightarrow$ Generated Image
  • Cost function $J(G)=\alpha J_{content}(C,G)+\beta J_{style}(S,G)$
  • $$J_{content}(C,G)={1\over 2}\|a^{[l](C)}-a^{[l](G)}\|^2$$ (is the layer $l$ a hyperparameter?)
  • $$G_{kk'}^{[l](S)}=\sum_{i=1}^{n_H^{[l]}} \sum_{j=1}^{n_W^{[l]}} a_{i,j,k}^{[l](S)}a_{i,j,k'}^{[l](S)}$$ (style matrix, $n_C^{[l]}\times n_C^{[l]}$; $G^{[l](G)}$ is defined the same way for the generated image)
  • $$J_{style}^{[l]}(S,G)={1\over (2n_H^{[l]}n_W^{[l]}n_C^{[l]})^2}\sum_k\sum_{k'}\left(G_{kk'}^{[l](S)}-G_{kk'}^{[l](G)}\right)^2$$
  • $J_{style}(S,G)=\sum_l \lambda^{[l]} J_{style}^{[l]}(S,G)$
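
The content cost, style (Gram) matrix and per-layer style cost can be written out for a single layer. A minimal sketch, assuming the activations of layer $l$ are given as nested lists indexed `a[i][j][k]` with shape $n_H \times n_W \times n_C$; the function names are illustrative, and a real implementation would use a tensor library instead of Python loops.

```python
def shape(a):
    """Return (n_H, n_W, n_C) of a nested-list activation volume."""
    return len(a), len(a[0]), len(a[0][0])


def content_cost(a_c, a_g):
    """J_content = 1/2 * ||a(C) - a(G)||^2 for one layer."""
    n_h, n_w, n_c = shape(a_c)
    return 0.5 * sum((a_c[i][j][k] - a_g[i][j][k]) ** 2
                     for i in range(n_h) for j in range(n_w) for k in range(n_c))


def gram_matrix(a):
    """Style matrix G[k][k'] = sum over i, j of a[i][j][k] * a[i][j][k']."""
    n_h, n_w, n_c = shape(a)
    return [[sum(a[i][j][k] * a[i][j][kp]
                 for i in range(n_h) for j in range(n_w))
             for kp in range(n_c)]
            for k in range(n_c)]


def layer_style_cost(a_s, a_g):
    """J_style for one layer, normalized by (2 * n_H * n_W * n_C)^2."""
    n_h, n_w, n_c = shape(a_s)
    g_s, g_g = gram_matrix(a_s), gram_matrix(a_g)
    squared_diff = sum((g_s[k][kp] - g_g[k][kp]) ** 2
                       for k in range(n_c) for kp in range(n_c))
    return squared_diff / (2 * n_h * n_w * n_c) ** 2
```

The total style cost is then the $\lambda^{[l]}$-weighted sum of `layer_style_cost` over the chosen layers, and $J(G)$ combines it with `content_cost` using the weights $\alpha$ and $\beta$.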

1D and 3D Generalization

Other

Visualizing and understanding ConvNet

Softmax Regression

  • $$a^{[l]}_j={e^{z^{[l]}_j}\over \sum_k e^{z^{[l]}_k}}$$
  • $L(y,\hat{y})=-\sum_j y_j\log \hat{y}_j$
  • $J={1\over m}\sum_{i=1}^{m} L(y^{(i)},\hat{y}^{(i)})$
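
These three formulas can be sketched with the standard library alone. The version below assumes one-hot (or at least non-negative) label vectors and subtracts the maximum logit before exponentiating, a common numerical-stability trick that leaves the result unchanged; the function names are illustrative.

```python
import math


def softmax(z):
    """Softmax activation: exp(z_j) / sum_k exp(z_k)."""
    shift = max(z)  # subtracting the max avoids overflow, same result
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y, y_hat):
    """L(y, y_hat) = -sum_j y_j * log(y_hat_j); terms with y_j = 0 are skipped."""
    return -sum(t * math.log(p) for t, p in zip(y, y_hat) if t > 0)


def cost(ys, y_hats):
    """J = (1/m) * sum of per-example losses."""
    return sum(cross_entropy(y, y_hat) for y, y_hat in zip(ys, y_hats)) / len(ys)
```

For a one-hot label the loss reduces to $-\log \hat{y}_c$ for the true class $c$, so with two equally likely classes it equals $\log 2$.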

Exercises

Keras - Tutorial - Happy House v2