In this work, we propose a neural network for boundary-aware face alignment. The network comprises two stages: the first estimates boundary heatmaps, and the second predicts landmark positions. We build the first stage by enhancing a baseline HourglassNet, chiefly by adding a CoordConv layer and shallow-and-deep feature fusion (SDFusion) blocks. For the second stage, we design a subnet that first fuses the original image, a latent feature from the first stage, and the boundary heatmaps the first stage generates, and then uses a Transformer to map the fused feature to landmark coordinates. Experiments show that the proposed algorithm achieves state-of-the-art performance on benchmark datasets.
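The data flow of the two-stage design can be sketched schematically. The sketch below is a minimal NumPy illustration of the tensor shapes only, not the actual architecture: the hourglass, SDFusion blocks, and Transformer are replaced by random linear projections and pooling, and the counts of boundary heatmaps (13) and landmarks (68) are hypothetical placeholders chosen for illustration. Only the CoordConv step (appending normalized coordinate channels) and the channel-wise fusion of image, latent feature, and heatmaps follow the description above.

```python
import numpy as np

def add_coord_channels(x):
    """CoordConv idea: append normalized (y, x) coordinate channels, (C, H, W) -> (C+2, H, W)."""
    c, h, w = x.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    return np.concatenate([x, ys[None], xs[None]], axis=0)

def stage1_boundary(x, n_boundaries=13):
    """Stand-in for the enhanced hourglass: returns boundary heatmaps and a latent feature.
    The 1x1 'conv' is a random channel projection, purely a placeholder."""
    xc = add_coord_channels(x)
    rng = np.random.default_rng(0)
    w_heat = rng.standard_normal((n_boundaries, xc.shape[0]))
    heatmaps = np.einsum("oc,chw->ohw", w_heat, xc)
    latent = xc.mean(axis=0, keepdims=True)  # placeholder latent feature
    return heatmaps, latent

def stage2_landmarks(img, latent, heatmaps, n_landmarks=68):
    """Stand-in for the second-stage subnet: fuse image, latent feature, and heatmaps
    channel-wise, then map the fused feature to (x, y) landmark coordinates.
    Global pooling + a linear map replace the Transformer here."""
    fused = np.concatenate([img, latent, heatmaps], axis=0)
    pooled = fused.mean(axis=(1, 2))
    rng = np.random.default_rng(1)
    w_out = rng.standard_normal((n_landmarks * 2, pooled.shape[0]))
    return (w_out @ pooled).reshape(n_landmarks, 2)

img = np.zeros((3, 64, 64))
heatmaps, latent = stage1_boundary(img)
coords = stage2_landmarks(img, latent, heatmaps)
print(heatmaps.shape, coords.shape)  # (13, 64, 64) (68, 2)
```

The point of the sketch is the interface between the stages: stage one consumes the coordinate-augmented image and emits both boundary heatmaps and an intermediate feature, and stage two fuses those two outputs with the raw image before regressing coordinates.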