To investigate the intra-examiner and inter-examiner reliability of physical examination to identify asymmetry of selected anatomical landmarks indicative of pelvic somatic dysfunction in subjects with and without low back pain using experienced osteopaths and final year students of osteopathy.

Four examiners (two students, two osteopaths) examined a sample of symptomatic (n=5) and asymptomatic (n=4) subjects for symmetry of anatomical landmarks indicative of pelvic somatic dysfunction. Two assessments of symmetry and alignment of the posterior superior iliac spine (PSIS), sacral sulcus, sacral inferior lateral angle (ILA) in posterior–anterior (ILA-P) and inferior–superior (ILA-I) directions, anterior superior iliac spine (ASIS), and medial malleoli were performed on every subject by all four examiners. Intra-examiner and inter-examiner reliability was analysed with kappa (κ) and reported in conjunction with observed agreement (P_o).

Estimates of intra-examiner reliability ranged from κ=−0.29 to 1.0 (PSIS κ=−0.29 to 0.39; sacral sulcus κ=−0.28 to 0.83; ILA-P κ=−0.29 to 0.44; ILA-I κ=−0.29 to 0.34; ASIS κ=0.25 to 0.63; medial malleoli κ=0.20 to 1.0) and were higher than estimates of inter-examiner reliability. Inter-examiner reliability estimates ranged from κ=−0.38 to 0.51 (PSIS κ=−0.38 to 0.35; sacral sulcus κ=−0.34 to 0.26; ILA-P κ=−0.18 to 0.51; ILA-I κ=−0.13 to 0.36; ASIS κ=−0.13 to 0.50; medial malleoli κ=−0.05 to 0.49). The median observed agreement between examiners for each anatomical landmark ranged from 33 to 50%. Osteopaths were more reliable on measures of the inferior lateral angle (ILA-P), while students were more reliable on measures of the sacral sulcus.

In this study, the reliability of physical examination for anatomical landmarks indicative of pelvic somatic dysfunction was generally found to be low. Differences between the reliability of experienced osteopaths and final year osteopathy students were negligible. Examiners were most reliable in their assessment of the ASIS and medial malleoli; however, these estimates were not consistent and were too low to be considered clinically useful.
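
For readers unfamiliar with the two statistics reported above, the following is a minimal sketch of how observed agreement and kappa relate for a pair of ratings. It assumes the unweighted two-rater (Cohen) form of kappa; the abstract does not state which variant was used. The function names, the category labels ("L", "R", "E" for left, right, even) and the example ratings are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def observed_agreement(ratings_a, ratings_b):
    """Proportion of subjects on which the two sets of ratings agree."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: agreement corrected for chance.

    kappa = (P_o - P_e) / (1 - P_e), where P_e is the agreement expected
    by chance from each rater's marginal category frequencies.
    """
    n = len(ratings_a)
    p_o = observed_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings for nine subjects (made up for illustration only).
examiner_1 = ["L", "R", "E", "L", "R", "E", "L", "R", "E"]
examiner_2 = ["L", "R", "E", "R", "E", "L", "L", "R", "E"]

print(round(observed_agreement(examiner_1, examiner_2), 2))
print(round(cohen_kappa(examiner_1, examiner_2), 2))
```

Because kappa subtracts chance agreement, it becomes negative when observed agreement falls below the level expected by chance, which is how estimates such as κ=−0.38 can arise. The same functions apply to intra-examiner reliability if the two lists hold one examiner's first and second assessments.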