Using micro-simulation and the Bush-Mosteller reinforcement learning model, this paper models urban commuters' departure-time choice on a many-to-one transit system during the morning peak period. Three typical urban public transport priority policies are studied. The results show that, if the free-fare period is timed appropriately, the pre-peak free-fare policy helps stagger the commuting peak by influencing commuters' departure-time decisions. The bus-acceleration policy lowers commuters' costs, but it is likely to increase congestion and put more pressure on the public transit system. The policy of increasing departure frequency partially alleviates peak congestion but cannot fundamentally eliminate it, and it may increase operating costs. This research contributes to a better understanding of commuters' departure-time choice and commuting equilibrium during the peak period. The approach also provides an effective way to explore the formation and evolution of complex traffic phenomena.
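
For readers unfamiliar with the Bush-Mosteller model named above, the following is a minimal sketch of its generic probability update applied to a set of candidate departure times. It is not the paper's implementation: the function name, the `stimulus` scaling to [-1, 1], the `learning_rate` value, and the proportional renormalization of the unchosen options are all assumptions made for illustration, and the abstract does not state how commuting cost is mapped to a stimulus.

```python
def bush_mosteller_update(probs, chosen, stimulus, learning_rate=0.5):
    """Return updated choice probabilities after one commuting day.

    probs         -- current probabilities over departure-time slots (sum to 1)
    chosen        -- index of the slot the commuter actually used
    stimulus      -- satisfaction with the outcome, assumed scaled to [-1, 1]
    learning_rate -- hypothetical learning-rate parameter in (0, 1]
    """
    if len(probs) < 2:
        raise ValueError("need at least two departure-time slots")
    if not -1.0 <= stimulus <= 1.0:
        raise ValueError("stimulus must lie in [-1, 1]")

    p = probs[chosen]
    if stimulus >= 0:
        # Satisfying outcome: move the chosen slot's probability toward 1.
        new_p = p + learning_rate * stimulus * (1.0 - p)
    else:
        # Dissatisfying outcome: move the chosen slot's probability toward 0.
        new_p = p + learning_rate * stimulus * p

    rest = 1.0 - p  # probability mass previously held by the other slots
    updated = []
    for k, q in enumerate(probs):
        if k == chosen:
            updated.append(new_p)
        elif rest > 0:
            # Rescale the other slots proportionally so the total stays 1.
            updated.append(q * (1.0 - new_p) / rest)
        else:
            # The chosen slot held all the mass: spread the remainder evenly.
            updated.append((1.0 - new_p) / (len(probs) - 1))
    return updated


# Example: four departure-time slots, initially equally likely.
slots = [0.25, 0.25, 0.25, 0.25]
after_good_day = bush_mosteller_update(slots, chosen=1, stimulus=1.0)
after_bad_day = bush_mosteller_update(slots, chosen=1, stimulus=-1.0)
```

In a micro-simulation, each simulated commuter would repeat this update day after day, so that the aggregate departure-time distribution evolves in response to the policy being tested.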