Semantic segmentation of remote sensing images remains challenging because objects have complex structures and appear at widely varying scales. This paper proposes a hybrid segmentation model that combines SegFormer for global context extraction with Dynamic Snake Convolution for capturing fine-grained, boundary-aware features. An auxiliary semantic branch is introduced to improve feature alignment across scales. Experiments on three benchmark datasets (LoveDA, Potsdam, and Vaihingen) show that the proposed approach consistently improves mean Intersection over Union (mIoU) over baseline models, particularly when segmenting irregular and linear structures. The framework is a promising approach to high-resolution land cover mapping and urban scene understanding.
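
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how mIoU is commonly computed from a pixel-level confusion matrix. It assumes the standard definition (per-class TP / (TP + FP + FN), averaged over classes that occur in either map); the exact protocol used in the experiments, such as which classes are ignored, is not stated in this abstract. The function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative only.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count pixels: rows index ground-truth classes, columns index predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU, skipping classes absent from both label maps."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # truth is c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, truth is otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps with three hypothetical classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(truth, pred, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2. Because every class contributes equally to the mean regardless of pixel count, mIoU is sensitive to errors on thin or rare structures that overall pixel accuracy would largely hide.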