目标追踪小任务（基于SIFT，LK光流，ceres）

一、流程简述

这篇博客实现了在视频中追踪一个矩形物体并且用目标样板将其替换。首先先使用sift来检测特征点，用暴力匹配来得到两相邻帧的匹配，再使用LK光流追踪来获得相邻两帧的匹配，接着根据欧氏距离取两帧匹配点对的交集，再通过获得的点对来计算单应性矩阵（使用RANSAC方法），最后将获得的单应性矩阵作为初始值，用ceres进行非线性优化，从而得到较好的单应性矩阵。得到单应性矩阵后计算前一帧目标四点在后一帧的位置，从而进行贴图操作。

二、步骤解读

1、SIFT特征点检测

1.1SIFT原理简介

SIFT(Scale Invariant Feature Transform)，即尺度不变特征变换。具有较好的稳定性和不变性，能够适应旋转、尺度缩放、亮度的变化，能在一定程度上不受视角变化、仿射变换、噪声的干扰。
Sift特征点检测主要包括以下几个步骤：
（1）尺度空间极值点检测
（2）特征点的定位
（3）为特征点赋予方向信息
（4）生成描述子

尺度空间极值点检测

尺度空间一般是由图像金字塔构建的，对一幅图像进行降采样，在不同分辨率下提取图像的关键点，Sift关键点是通过高斯函数对图像进行不同程度的模糊处理得到的高斯金字塔。通过图像金字塔提取的关键点具有尺度不变性。
高斯差分金字塔是由高斯金字塔相邻两层进行相减得到的，极值点检测只在同尺度的每一组图像中的内层图像中进行（即差分金字塔中每层有7张图像，只检测中间五张）。
在这里插入图片描述

在这里插入图片描述
极值点检测如下图，中间的检测点和它同尺度的8个相邻点和上下相邻尺度对应的9×2个点共26个点比较，以确保在尺度空间和二维图像空间都检测到极值点。

在这里插入图片描述

特征点的定位

对检测到的极值点进行非极大值抑制，即拟合三维二次函数来精确确定关键点的位置和尺度（达到亚像素精度）,如下图：
在这里插入图片描述
为了去除不稳定的边缘响应点，通过计算主曲率来剔除，一个定义不好的高斯差分算子的极值在横跨边缘的地方有较大的主曲率，而在垂直边缘的方向有较小的主曲率。主曲率通过2x2的Hessian矩阵求出。

为特征点赋予方向信息

利用关键点邻域像素的梯度方向分布特性为每个关键点指定方向参数，使算子具备旋转不变性。使用图像梯度的方法求取局部结构的稳定方向。对于在DOG金字塔中检测出的关键点点，采集其所在高斯金字塔图像3σ邻域窗口内像素的梯度和方向分布特征。梯度的模值和方向如下：
在这里插入图片描述

梯度直方图将0~360度的方向范围分为36个柱，其中每柱10度。如下图所示，直方图的峰值方向代表了关键点的主方向，方向直方图的峰值则代表了该特征点处邻域梯度的方向，以直方图中最大值作为该关键点的主方向。为了增强匹配的鲁棒性，只保留峰值大于主方向峰值80％的方向作为该关键点的辅方向。为了消除噪声点，使用高斯函数加权累加。
在这里插入图片描述

生成描述子

生成描述子，用于判断不同图像上的特征点是否为同一个点。在这里插入图片描述计算关键点周围的1616的窗口中每一个像素的梯度，且使用高斯函数降低远离中心的权重。这样就可以对每个特征点形成一个44*8=128维的描述子。如下图所示。

在这里插入图片描述

2、特征点匹配

2.1 暴力匹配

直接将相邻两张图片的所有特征点进行暴力匹配，将第一张图片的每个特征点的描述子与第二张的每一个特征点的描述子求二范数，二范数小的即视为匹配对。
但特征点很多，这样匹配出来的结果很多是弱匹配。设二范数为D,通过条件D<max(3*min_D,20)来剔除弱匹配点。

2.2 LK光流追踪

光流法的三个假设：
（1）亮度恒定：假设像素的亮度不会因为帧的跟踪改变。
（2）时间持续性或“微小移动”：图像上点的运动随时间变化缓慢，即目标两帧之间的运动幅度不会太大。
（3）空间一致性：场景中属于相同表面的相邻点，具有相似的运动，并且其投影到图像平面的点距离也比较近。
根据之前筛选的特征点，用光流法求出下一帧的匹配点。

2.3 强匹配点筛选

根据之前暴力匹配出的下一帧特征点，与光流法求出的下一帧特征点，通过求两者的欧氏距离，设定阈值，小于此阈值的即为强匹配点。

3、求单应性矩阵

3.1根据强匹配点对求单应性矩阵

使用findHomography()函数
在这里插入图片描述

第一个参数是第一张图片特征点，第二个参数是第二张图片匹配点，第三个参数是使用的方法，本文使用的是RANSAC方法，则第四个参数是RANSAC方法的阈值，第四个参数是mask，第五个参数是迭代次数，第六个参数是置信度。

RANSAC方法优化单应性矩阵：随机取四个点对可以求出一个单应性矩阵H，然后用H分别第一张图片特征点的坐标x1得到_x2，看看_x2中的点有多少符合阈值，并且输出达到置信度要求的H。

3.2 使用Ceres对单应性矩阵进行非线性优化

把之前求得的单应性矩阵H作为初始点，然后输入之前的点对使用Ceres框架进行非线性优化，最小目标函数，从而得到较好的单应性矩阵。

三、代码解读

代码如下，注释已经很详细。

#include <opencv2/opencv.hpp>
#include<iostream>
#include <ceres/ceres.h>
#include <Eigen/Core>
#include <Eigen/Dense>
#include<opencv2/xfeatures2d.hpp>
#include <glog/logging.h>
#include<opencv2/core/eigen.hpp>
using namespace cv;
using namespace std;
using namespace cv::xfeatures2d;
using namespace Eigen;Mat logoimg = imread("直方图对比1.jpg");
Mat frame, gray, prev_frame, mask;
Point point_mouse;
vector<Point2f> logo_4points;//logo图片的四个角点
vector<Point2f> frame_4points;
vector<uchar> status;
vector<float> errors;
Mat homo_H;//单应性矩阵
//sift特征点检测变量
vector<KeyPoint> keyPoints_1, keyPoints_2;
Mat descriptors_1, descriptors_2;
vector<Point2f> fpts[2];
//ceres优化单应性矩阵中全局变量的说明
typedef Eigen::Matrix<double ,3,3> E_Mat33;
typedef Eigen::Matrix<double ,2,1> E_Mat21;
typedef Eigen::MatrixXd E_Matxx;
typedef Eigen::VectorXd E_Vec1X;//设置优化迭代次数和误差终止值
struct EstimateHomographyOptions{EstimateHomographyOptions():max_num_iterations(50),expected_average_symmetric_distance(1e-16){}int max_num_iterations;double expected_average_symmetric_distance;
};
//计算代价函数：四个分量
template <typename T>
void SymmetricGeometricDistanceTerms(const Eigen::Matrix<T,3,3> &H,//单应性矩阵变量const Eigen::Matrix<T,2,1> &x1,//前一帧匹配点坐标变量const Eigen::Matrix<T,2,1> &x2,//后一帧匹配点坐标变量T &residual                    //代价函数变量
){typedef Eigen::Matrix<T,3,1> E_Mat31_T;//定义齐次坐标类型T forward_error[2];E_Mat31_T x(x1(0),x1(1),T(1.0));//将前一帧匹配点像素坐标转换为齐次坐标E_Mat31_T y(x2(0),x2(1),T(1.0));//将后一帧匹配点像素坐标转换为齐次坐标E_Mat31_T H_x = H*x;            //计算前一帧匹配点在当前帧中的重投影坐标H_x/=H_x(2);                    //归一化重投影坐标forward_error[0]=H_x(0)-y(0);   //计算重投影的x方向误差forward_error[1]=H_x(1)-y(1);   //计算重投影的y方向误差
//    cout<<"backward_error:"<<forward_error[0]<<"\t"<<forward_error[1]<<endl;
//    waitKey(1);residual=sqrt(forward_error[0]*forward_error[0]+forward_error[1]*forward_error[1]);//计算重投影误差的二范数作为代价函数误差
};
//构建代价函数结构
struct HomographySymmetricGeometricCostFunctor{HomographySymmetricGeometricCostFunctor(const E_Mat21 &x,const E_Mat21 &y):_x(x),_y(y){}//x,y是传入代价函数的每对匹配点//重载()运算符并求代价函数template <typename T>//homography_parameters是优化变量，residuals是代价函数误差值，均为一维向量bool operator()(const T*homography_parameters,T*residuals)const{typedef Eigen::Matrix<T,3,3> E_Mat33_T;//Eigen下的3x3矩阵typedef Eigen::Matrix<T,2,1> E_Mat21_T;//Eigen下的2x1矩阵E_Mat33_T cH(homography_parameters);//将data指针的向量转为矩阵类型E_Mat21_T cx(T(_x(0)),T(_x(1)));//将Mat类型的前一帧匹配点坐标转化成矩阵类型E_Mat21_T cy(T(_y(0)),T(_y(1)));//将Mat类型的后一帧匹配点坐标转化成矩阵类型SymmetricGeometricDistanceTerms<T>(cH,cx,cy,residuals[0]);//调用代价函数求解公式return true;}const E_Mat21 _x;const E_Mat21 _y;
};//ｃｅｒｅｓ进行优化bool EstimateHomography2DFromCorrespondences(const E_Matxx &x1,//前一帧匹配点集const E_Matxx &x2,//当前帧匹配点集const EstimateHomographyOptions &options,E_Mat33 &H        //单应性矩阵
){assert(2==x1.rows());assert(4<=x1.cols());assert(x1.rows()==x2.rows());assert(x1.cols()==x2.cols());//初始化Ｈcv::cv2eigen(homo_H,H);//opencv中的Mat转换为eigen中的Matrixcout<<"初始化矩阵："<<H<<endl;ceres::Problem problem;//构建最小二乘问题//在构建的问题中添加匹配点，核函数，单应性矩阵for (int i = 0; i <x1.cols() ; ++i) {HomographySymmetricGeometricCostFunctor *homography_symmetric_geometric_cost_functor=new HomographySymmetricGeometricCostFunctor(x1.col(i),x2.col(i));problem.AddResidualBlock(//ceres自动求导求增量方程<代价函数类型，代价函数维度，优化变量维度>(代价函数)new ceres::AutoDiffCostFunction<HomographySymmetricGeometricCostFunctor,1,9>(homography_symmetric_geometric_cost_functor),new ceres::CauchyLoss(1),//添加柯西核函数(减小异常点对求解的影响）H.data()//添加的优化变量（单应性矩阵），矩阵转化为了一维向量);}//优化求解ceres::Solver::Options solver_options;//实例化求解器对象solver_options.linear_solver_type=ceres::DENSE_QR;solver_options.minimizer_progress_to_stdout= true;ceres::Solver::Summary summary;//实例化求解对象ceres::Solve(solver_options,&problem,&summary);//求解cout<<"summary:\n"<<summary.BriefReport();return summary.IsSolutionUsable();
}
void getlogo_4points() {Size logosize = logoimg.size();logo_4points.push_back(Point2f(0, 0));logo_4points.push_back(Point2f(logosize.width - 1, 0));logo_4points.push_back(Point2f(logosize.width - 1, logosize.height - 1));logo_4points.push_back(Point2f(0, logosize.height - 1));
}void on_Left_Mouse(int event, int x, int y, int flags, void *param) {Mat &image = *(Mat *) param;switch (event) {case EVENT_LBUTTONDOWN: {point_mouse.x = x;point_mouse.y = y;frame_4points.push_back(point_mouse);
//            circle(image,point_mouse,5,Scalar(0,0,255),-1,8);}break;}
}void detectFeature(Mat &_gray) {Ptr<SIFT> sift = xfeatures2d::SIFT::create();sift->detectAndCompute(_gray, mask, keyPoints_1, descriptors_1);vector<Point2f> feature_1;Mat temp;_gray.copyTo(temp);for (int i = 0; i < keyPoints_1.size(); i++) {feature_1.push_back(keyPoints_1[i].pt);}fpts[0] = feature_1;cout << "图片特征点个数" << fpts[0].size() << endl;
}void LKAndSift() {//光流追踪calcOpticalFlowPyrLK(prev_frame, gray, fpts[0], fpts[1], status, errors);//sift特征点Ptr<SIFT> sift = xfeatures2d::SIFT::create();sift->detectAndCompute(gray, mask, keyPoints_2, descriptors_2);BFMatcher matcher(NORM_L2);vector<DMatch> matches;matcher.match(descriptors_1, descriptors_2, matches);double min_dist = 10000, max_dist = 0;for (int i = 0; i < descriptors_1.rows; i++) {double dist = matches[i].distance;if (dist < min_dist)min_dist = dist;if (dist > max_dist)max_dist = dist;}cout << "min:" << min_dist << "\t" << "max:" << max_dist << endl;//存下符合条件的匹配结果（即其距离小于3*min_dist的）,需要加一个经验值作为下限以防止距离过小vector<DMatch> good_matches;for (int i = 0; i < descriptors_1.rows; i++) {if (matches[i].distance < max(4 * min_dist, 60.0)) {good_matches.push_back(matches[i]);}}cout << "good matches num:" << good_matches.size() << endl;//求光流追踪特征点与sift特征点的交集int k = 0;for (int i = 0; i < good_matches.size(); i++) {int sift_index_1 = good_matches[i].queryIdx;int sift_index_2 = good_matches[i].trainIdx;float a = fpts[1][sift_index_1].x - keyPoints_2[sift_index_2].pt.x;float b = fpts[1][sift_index_1].y - keyPoints_2[sift_index_2].pt.y;float a_b = abs(a) + abs(b);cout<<"a_b="<<a_b<<endl;if ((a_b) < 0.5 && status[sift_index_1]) {fpts[0][k] = fpts[0][sift_index_1];fpts[1][k++] = fpts[1][sift_index_1];}}fpts[0].resize(k);fpts[1].resize(k);cout << fpts[0].size() << endl;cout << fpts[1].size() << endl;
}int main(int argc, char **argv) {VideoCapture vi;vi.open("rili.mp4");if (!vi.isOpened()) {cout << "could not load video file..." << endl;return -1;}
//    double rate = vi.get(CAP_PROP_FPS);//获取帧率VideoWriter vo;vo.open("rili1output.mp4",VideoWriter::fourcc('M', 'J', 'P', 'G'),//四个字符代码指示编解码(double) vi.get(CAP_PROP_FPS),Size((int) vi.get(CAP_PROP_FRAME_WIDTH), (int) vi.get(CAP_PROP_FRAME_HEIGHT)),true);namedWindow("video_input");namedWindow("video_output");getlogo_4points();//获取logo四个角点存入logo_4pointsint n = 0;while (vi.read(frame)) {n++;cvtColor(frame, gray, COLOR_BGR2GRAY);if (prev_frame.empty()) {gray.copyTo(prev_frame);}if (n == 1) {Mat frame_one;frame.copyTo(frame_one);setMouseCallback("video_input", on_Left_Mouse, &frame_one);Mat tempImage;frame.copyTo(tempImage);while (true) {circle(tempImage, point_mouse, 5, Scalar(0, 0, 255), -1, 8);imshow("video_input", tempImage);if (waitKey(10) == 27)break;}mask = Mat::zeros(gray.size(), gray.type());vector<Point> points_mask;for (int i = 0; i < 4; ++i) {points_mask.push_back(frame_4points[i]);}fillConvexPoly(mask, points_mask, Scalar(255), LINE_AA);//填充特征点检测的掩膜为1detectFeature(gray);//贴图Mat warp_Homo = getPerspectiveTransform(logo_4points, frame_4points);//计算透视变换矩阵Mat frame_clone, frame_fill;frame.copyTo(frame_clone);frame.copyTo(frame_fill);warpPerspective(logoimg, frame_clone, warp_Homo, frame_clone.size());//进行透视变换fillConvexPoly(frame_fill, points_mask, Scalar(0, 0, 0), LINE_AA);//将第一帧图像掩膜部分设为0frame_fill = frame_fill + frame_clone;vo << frame_fill;}if (n >= 2) {
//            imshow("video_output", frame);LKAndSift();homo_H = findHomography(fpts[0], fpts[1], RANSAC,3, noArray(), 2000, 0.995);//第四个参数为阈值，用来判断点是内点还是外点；//第五个参数为最大迭代次数//第六个参数为置信度//ceres优化单应性矩阵//1\数据转换int num = fpts[0].size();E_Matxx x1(2, num);//前一帧匹配点集E_Matxx x2(2, num);//当前帧匹配点集//将相邻两帧匹配点存入for (int i = 0; i < num; ++i) {x1(0, i) = static_cast<double>(fpts[0][i].x);x1(1, i) = static_cast<double>(fpts[0][i].y);x2(0, i) = static_cast<double>(fpts[1][i].x);x2(1, i) = static_cast<double>(fpts[1][i].y);}cout<<x1.cols()<<endl;E_Mat33 estimated_matrix;EstimateHomographyOptions options;options.expected_average_symmetric_distance = 0.02;EstimateHomography2DFromCorrespondences(x1, x2, options, estimated_matrix);// Normalize the matrix for easier comparison.estimated_matrix /= estimated_matrix(2, 2);cout<<endl;cout << "homo_H:" << homo_H << endl;std::cout << "Estimated matrix:\n" << estimated_matrix << "\n";cv::eigen2cv(estimated_matrix,homo_H);cout<<"H:"<<homo_H<<endl;//贴图vector<Point2f> tran_4points;perspectiveTransform(frame_4points, tran_4points, homo_H);Mat warp_Homo = getPerspectiveTransform(logo_4points, tran_4points);//计算透视变换矩阵Mat frame_clone, frame_fill;frame.copyTo(frame_clone);frame.copyTo(frame_fill);warpPerspective(logoimg, frame_clone, warp_Homo, frame_clone.size());//进行透视变换vector<Point> points_mask;for (int i = 0; i < 4; ++i) {points_mask.push_back(tran_4points[i]);}fillConvexPoly(frame_fill, points_mask, Scalar(0, 0, 0), LINE_AA);imshow("mask",mask);frame_fill = frame_fill + frame_clone;vo << frame_fill;imshow("sss",frame_fill);//将原来的第二张图片设置成第一张mask = Mat::zeros(gray.size(), gray.type());fillConvexPoly(mask, points_mask, Scalar(255), LINE_AA);gray.copyTo(prev_frame);detectFeature(prev_frame);
//            cout<<"n="<<n;for (int i = 0; i < 4; i++) {frame_4points[i] = tran_4points[i];}}char c = waitKey(1);if (c == 27)break;}waitKey();return 0;
}