This section is heavily based on the contents of @hartley2003multiple.
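The projections $\mathbf{x}$ and $\mathbf{x}'$ of a 3D point $\mathbf{X}$ onto two images are always related by the fundamental matrix $\mathbf{F}$ through the epipolar constraint $\mathbf{x}'^{\top}\mathbf{F}\mathbf{x} = 0$.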
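Combining the fundamental matrix with the intrinsic parameters $\mathbf{K}$ and $\mathbf{K}'$ of the two cameras gives its metric counterpart, the essential matrix $\mathbf{E}$, which relates the image points expressed in normalized camera coordinates:
$$\mathbf{E} = \mathbf{K}'^{\top}\mathbf{F}\mathbf{K}$$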
The essential matrix links the relative pose of the cameras to the fundamental matrix, and the camera poses can be recovered from its singular value decomposition (SVD):
$$\mathbf{E} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{\top}$$
where $\mathbf{U}, \mathbf{V}, \mathbf{\Sigma} \in \mathbb{R}^{3 \times 3}$. The first two matrices are orthogonal, while the last is diagonal and holds the singular values of $\mathbf{E}$, which, according to the internal constraints of the essential matrix, must consist of two identical values and one zero.
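
The relations above can be checked numerically. The following is a minimal NumPy sketch, assuming the standard relation $\mathbf{E} = \mathbf{K}'^{\top}\mathbf{F}\mathbf{K}$ and a second camera of the form $\mathbf{K}'[\mathbf{R} \mid \mathbf{t}]$. The intrinsics, the pose, and the helper names `skew` and `essential_from_fundamental` are hypothetical and chosen only for illustration.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_fundamental(F, K1, K2):
    """E = K2^T F K1, with K1 and K2 the intrinsics of the first and second camera."""
    return K2.T @ F @ K1

# Hypothetical setup: shared intrinsics, a small rotation about the y axis, a translation.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])

E_true = skew(t) @ R                 # essential matrix of the known pose
K_inv = np.linalg.inv(K)
F = K_inv.T @ E_true @ K_inv         # corresponding fundamental matrix

# Project one 3D point into both views and evaluate the epipolar constraint.
X = np.array([0.3, -0.1, 5.0])
x1 = K @ X                           # first camera  K [I | 0]
x2 = K @ (R @ X + t)                 # second camera K [R | t]
residual = x2 @ F @ x1               # close to zero for a consistent pair

E = essential_from_fundamental(F, K, K)
singular_values = np.linalg.svd(E, compute_uv=False)  # two equal values and a zero
```

In this noise-free setting the recovered `E` matches `E_true`; with estimated matrices the singular values only approximately follow the two-equal-one-zero pattern.
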
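By defining a helper matrix $\mathbf{W}$ as
$$\mathbf{W} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{W}^{-1} = \mathbf{W}^{\top}$$
the rotation matrix $\mathbf{R}$ and the skew-symmetric matrix $[\mathbf{t}]_{\times}$ of the translation vector $\mathbf{t}$ can be computed as
$$\mathbf{R} = \mathbf{U}\mathbf{W}^{-1}\mathbf{V}^{\top}, \qquad [\mathbf{t}]_{\times} = \mathbf{U}\mathbf{W}\mathbf{\Sigma}\mathbf{U}^{\top}$$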
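With real-world data, $\mathbf{\Sigma}$ may not exactly satisfy these constraints; in that case an alternative formulation of the translation can be used:
$$[\mathbf{t}]_{\times} = \mathbf{U}\mathbf{Z}\mathbf{U}^{\top}, \qquad \mathbf{Z} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$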
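
A minimal NumPy sketch of the SVD-based pose recovery follows. It assumes the usual helper matrix $\mathbf{W}$ (a 90° rotation about the $z$ axis) and returns both rotation candidates $\mathbf{U}\mathbf{W}^{\top}\mathbf{V}^{\top}$ and $\mathbf{U}\mathbf{W}\mathbf{V}^{\top}$ together with the translation direction, which is the last column of $\mathbf{U}$ up to sign. The function name `decompose_essential` is hypothetical, and choosing among the four resulting $(\mathbf{R}, \pm\mathbf{t})$ combinations (typically by requiring points to lie in front of both cameras) is left out.

```python
import numpy as np

# Helper matrix W; its inverse equals its transpose.
W = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

def decompose_essential(E):
    """Return two rotation candidates and the unit translation direction (up to sign)."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so U and V can be flipped to have determinant +1.
    # This guarantees that both candidates below are proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    R_a = U @ W.T @ Vt   # R = U W^-1 V^T
    R_b = U @ W @ Vt     # the second admissible rotation
    t = U[:, 2]          # left null vector of E; the scale of t is not observable
    return R_a, R_b, t
```

Because the singular values are not used, the sketch behaves the same way when $\mathbf{\Sigma}$ only approximately satisfies the constraints.
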
If the scene is planar, or if the camera motion is a pure rotation ($\mathbf{t} = 0$), the points are related by a simple homography $\mathbf{H}$ such that $\mathbf{x}' = \mathbf{H}\mathbf{x}$.
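
For the zero-translation case, the homography has the closed form $\mathbf{H} = \mathbf{K}'\mathbf{R}\mathbf{K}^{-1}$, assuming cameras $\mathbf{K}[\mathbf{I} \mid \mathbf{0}]$ and $\mathbf{K}'[\mathbf{R} \mid \mathbf{0}]$. The sketch below checks this on one point; the intrinsics, the rotation, and the name `rotation_homography` are hypothetical.

```python
import numpy as np

def rotation_homography(K1, K2, R):
    """Homography induced by a pure rotation R between two views: H = K2 R K1^-1."""
    return K2 @ R @ np.linalg.inv(K1)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = 0.05
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])

X = np.array([0.3, -0.1, 5.0])       # any 3D point, the depth does not matter here
x1 = K @ X                           # homogeneous projection in the first view
x2 = K @ (R @ X)                     # homogeneous projection in the rotated view

x2_from_H = rotation_homography(K, K, R) @ x1
# Homogeneous points are compared after dividing by the last coordinate.
x2_pixels = x2[:2] / x2[2]
x2_from_H_pixels = x2_from_H[:2] / x2_from_H[2]
```
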
The triangulation problem consists of computing the 3D point $\mathbf{X}$ from two image points $\mathbf{x}, \mathbf{x}'$ and the camera projection matrices $\mathbf{P}, \mathbf{P}'$ of two different views. The difficulty is that, in the presence of noise, the back-projected rays do not intersect. @hartley1997triangulation discusses several solutions to this problem.
Although the minimization appears to be over the three parameters of $\mathbf{X}$, it can be reduced to a single parameter.
Triangulation extends from two to $n$ views, where the additional equations overdetermine the linear system. In that case, the Direct Linear Transform (DLT) algorithm can be applied:
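
A minimal NumPy sketch of linear triangulation follows. It assumes the homogeneous formulation in which every view with image point $(x_i, y_i)$ and camera rows $\mathbf{p}_i^{1\top}, \mathbf{p}_i^{2\top}, \mathbf{p}_i^{3\top}$ contributes the two equations
$$x_i\,\mathbf{p}_i^{3\top}\mathbf{X} - \mathbf{p}_i^{1\top}\mathbf{X} = 0, \qquad y_i\,\mathbf{p}_i^{3\top}\mathbf{X} - \mathbf{p}_i^{2\top}\mathbf{X} = 0$$
and the stacked system $\mathbf{A}\mathbf{X} = \mathbf{0}$ is solved in the least-squares sense through the SVD. The names `triangulate_dlt` and `project` and the test cameras are hypothetical.

```python
import numpy as np

def triangulate_dlt(points, cameras):
    """Triangulate one 3D point from n >= 2 views.

    points  : sequence of (x, y) image coordinates, one per view
    cameras : sequence of 3x4 projection matrices, in the same order
    """
    rows = []
    for (x, y), P in zip(points, cameras):
        rows.append(x * P[2] - P[0])   # x * p3^T - p1^T
        rows.append(y * P[2] - P[1])   # y * p3^T - p2^T
    A = np.vstack(rows)                # shape (2n, 4)
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]
    return X_h[:3] / X_h[3]            # back to inhomogeneous coordinates

def project(P, X):
    """Project a 3D point with a 3x4 camera and return pixel coordinates."""
    x_h = P @ np.append(X, 1.0)
    return x_h[:2] / x_h[2]

# Hypothetical two-view setup with a sideways baseline.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.2, 4.0])
observations = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_dlt(observations, [P1, P2])
```

With noise-free observations `X_est` reproduces `X_true`; with noisy observations it is an algebraic least-squares estimate rather than a minimizer of the reprojection error.
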
Because DLT is not invariant to the choice of image coordinate frame, it is recommended, for numerical stability, to normalize the coordinates in each image independently before applying DLT and to denormalize the result afterwards.
The resulting matrix to apply will be:
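
A common choice, assumed here rather than taken from the text, is the isotropic normalization that moves the centroid $(c_x, c_y)$ of the points to the origin and scales them so that their mean distance from it is $\sqrt{2}$, which gives a similarity of the form
$$\mathbf{T} = \begin{pmatrix} s & 0 & -s\,c_x \\ 0 & s & -s\,c_y \\ 0 & 0 & 1 \end{pmatrix}$$
A minimal NumPy sketch under that assumption, with hypothetical function names:

```python
import numpy as np

def normalization_matrix(points):
    """Similarity T: centroid to the origin, mean distance from the origin sqrt(2)."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    mean_dist = np.linalg.norm(points - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])

def apply_homogeneous(T, points):
    """Apply a 3x3 transformation to an (n, 2) array of image points."""
    points = np.asarray(points, dtype=float)
    homog = np.column_stack([points, np.ones(len(points))])
    mapped = homog @ T.T
    return mapped[:, :2] / mapped[:, 2:3]
```

For triangulation, one way to use it is to pair the normalized points $\mathbf{T}\mathbf{x}$ with the camera $\mathbf{T}\mathbf{P}$, since $\mathbf{T}\mathbf{x} = (\mathbf{T}\mathbf{P})\mathbf{X}$ leaves the 3D point unchanged.
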