We consider the problem of recovering a complete (i.e., square and invertible) matrix $ A_{0}$ , from $ Y \in \mathbb R ^{n \times p}$ with $ Y = A_{0} X_{0}$ , provided $ X_{0}$ is sufficiently sparse. This recovery problem is central to theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and finds numerous applications in modern signal processing and machine learning. We give the first efficient algorithm that provably recovers $ A_{0}$ when $ X_{0}$ has $O \left ({ n }\right )$ nonzeros per column, under suitable probability model for $ X_{0}$ . In contrast, prior results based on efficient algorithms either only guarantee recovery when $ X_{0}$ has $O(\sqrt {n})$ zeros per column, or require multiple rounds of semidefinite programming relaxation to work when $ X_{0}$ has $O(n)$ nonzeros per column. Our algorithmic pipeline centers around solving a certain nonconvex optimization problem with a spherical constraint. In this paper, we provide a geometric characterization of the objective landscape. In particular, we show that the problem is highly structured with high probability: 1) there are no “spurious” local minimizers and 2) around all saddle points the objective has a negative directional curvature. This distinctive structure makes the problem amenable to efficient optimization algorithms. In a companion paper, we design a second-order trust-region algorithm over the sphere that provably converges to a local minimizer from arbitrary initializations, despite the presence of saddle points.