This paper proposes a novel learnable linear transform, the locally-structured unitary network (LSUN), that captures the tangent spaces of a manifold latent in high-dimensional data, enabling effective, systematic, and highly interpretable data-driven dimensionality reduction. LSUN provides a linear layer whose filter kernels are locally controllable and shift-variant, under the structural constraint that the layer as a whole is unitary. It resembles a convolutional layer in that its filter kernels are local and overlap, but it does not repeat a fixed kernel across positions. Owing to the unitary property, the kernels can be trained in a self-supervised manner. The proposed method is thus a candidate for realizing manifold learning. Although local selection of filter kernels, as in sparse modeling, can capture tangent spaces as a set of coordinates, the resulting set of kernels is redundant and the filters are not very interpretable. To address these problems, this study locally controls the coordinate axes by combining primitive local linear operations that preserve unitarity, such as Givens rotations, shifts, and butterfly operations. The ability of the proposed LSUN to capture tangent spaces is evaluated through low-dimensional approximation and dynamical system modeling experiments.
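As a rough illustration of why composing such primitives keeps the overall transform unitary, the following minimal sketch builds a small real-valued (orthogonal) matrix from position-dependent Givens rotations, a circular shift, and normalized butterflies. It is not the LSUN architecture itself: the function names (`givens_layer`, `butterfly_layer`, `shift_layer`, `toy_transform`), the ordering of the factors, the signal length, and the random angles are all assumptions made for illustration.

```python
import numpy as np

def givens_layer(angles):
    """Block-diagonal matrix of 2x2 Givens rotations, one angle per sample pair.

    Using a different angle at each position makes the layer shift-variant.
    """
    n = 2 * len(angles)
    G = np.zeros((n, n))
    for k, t in enumerate(angles):
        c, s = np.cos(t), np.sin(t)
        G[2 * k:2 * k + 2, 2 * k:2 * k + 2] = [[c, -s], [s, c]]
    return G

def butterfly_layer(n):
    """Block-diagonal normalized butterflies: (sum, difference) / sqrt(2)."""
    B = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    return np.kron(np.eye(n // 2), B)

def shift_layer(n, step=1):
    """Circular shift by `step` samples, expressed as a permutation matrix."""
    return np.roll(np.eye(n), step, axis=0)

def toy_transform(angles_a, angles_b):
    """Hypothetical composition: rotate, butterfly, shift, rotate again.

    The shift moves the pair boundaries, so the second rotation layer mixes
    samples from neighbouring pairs and the resulting kernels overlap.
    """
    n = 2 * len(angles_a)
    return givens_layer(angles_b) @ shift_layer(n) @ butterfly_layer(n) @ givens_layer(angles_a)

rng = np.random.default_rng(0)
n = 8                                             # assumed toy signal length
U = toy_transform(rng.uniform(-np.pi, np.pi, n // 2),
                  rng.uniform(-np.pi, np.pi, n // 2))

# Every factor is orthogonal, so the product is orthogonal as well.
is_unitary = np.allclose(U.T @ U, np.eye(n))

# The transpose inverts the transform exactly (perfect reconstruction).
x = rng.standard_normal(n)
x_rec = U.T @ (U @ x)

# Low-dimensional approximation: keep only the first p coefficients.
p = 3
x_low = U[:p].T @ (U[:p] @ x)
print(is_unitary, np.allclose(x, x_rec), np.linalg.norm(x - x_low))
```

Because the inverse is simply the transpose, a reconstruction error such as the norm of `x - x_low` can serve as a training objective without labels, which is one way to read the self-supervised training mentioned above.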