An easy-to-apply series of field tests for physical education teachers in an educational setting: ALPHA test battery

Valid and reliable measurement methods are the basic requirements that enable the measurement and evaluation process to reach education's basic goals. Using laboratory-based tools and methods to measure the physical fitness of students is usually not possible within school settings. Therefore, field-based tests provide crucial solutions for physical education teachers to measure both the physical fitness of school-aged students and the athletic competence of a school’s athletes. This study aims to identify the differences in physical-fitness levels between student athletes and non-athletes and to determine the effectiveness of the Assessing Levels of Physical Activity (ALPHA) test battery in discriminating between these groups. Sixty-eight healthy male students (34 athletes and 34 non-athletes) participated in this study. Because body mass index is a major indicator of physical fitness in student athletes and non-athletes, it was controlled across the study groups. After a familiarization session, participants completed three test sessions at 48-hour intervals. The 20-m progressive shuttle run test for cardiorespiratory fitness, the handgrip strength and standing broad jump tests for musculoskeletal fitness, and the 4 × 10-m shuttle run test for motor fitness were used to measure the fitness levels of the groups. A t-test was used to determine the differences between athletes and non-athletes, and effect sizes were calculated to assess practical importance. Additionally, a discriminant function analysis was used to determine whether the ALPHA test battery could differentiate between athletes and non-athletes. The findings indicated that student athletes had significantly greater levels of fitness than non-athletes. Additionally, when the effect of body mass index was eliminated, student athletes and non-athletes were classified correctly at a rate of 70.6% using these tests. Therefore, this study shows that physical education teachers can use the ALPHA test battery to monitor athletic performance and identify talented students.


Introduction
Measurement and evaluation are the major concepts used in education to monitor the learning progress of students and to assess the final learning outcomes (Adom, Mensah, & Dake, 2020;Myers, Lee, & Silverman, 2019;Rink, 2013). Valid and reliable measurement methods are the basic requirements that enable the measurement and evaluation process to reach education's basic goals (Scruggs, Mungen, & Oh, 2010;Vermunt, Ilie, & Vignoles, 2018). Suitable methodologies are needed to measure the achievements of each discipline in school settings (Adom, Mensah, & Dake, 2020;Caspersen, Frølich, Karlsen, & Aamodt, 2014). Physical education requires different measurement strategies than other disciplines because it includes psychomotor characteristics in addition to cognitive skills (Myers, Lee, & Silverman, 2019;Vaughn, Hur, & Russell, 2019). In current education systems, physical education teachers are expected both to measure the physical condition of all students and to follow the development of the student athletes involved in school sports (Armour & Yelling, 2007;Özkara & Kalkavan, 2018). For this reason, measurement methods that meet the needs of both the general student population and more specific groups, such as student athletes, provide important advantages in physical education.
Schools are the most suitable environments for children and adolescents to meet the physical activity levels recommended by the World Health Organization (WHO) and to adopt a healthy lifestyle (WHO, 2018). Considering that the physical activity opportunities of students in out-of-school environments are gradually decreasing, physical education classes and schools present important opportunities for students (Hallal et al., 2012;Özkara, 2018). For this reason, encouraging students to be physically active is one of the most current goals of physical education teachers (Alexandr, Sergij, & Olena, 2016). The physical fitness level of all students must be above a certain threshold whether or not they are part of a sports team (Ruiz et al., 2011;WHO, 2018). Since physical fitness includes different skills such as endurance, strength and flexibility, appropriate measurement methods should be used in the measurement and evaluation process (Chen, Hammond-Bennett, Hypnar, & Mason, 2018), because the data obtained from these methods are used to evaluate the current health status of students and form the basis of the strategies to be applied in the future (Bianco et al., 2015;Ruiz et al., 2009;Tabacchi, 2011).
Studies show that metabolic disorders that develop during adolescence continue to grow and increase in severity into adulthood. For this reason, it is particularly important to identify children with metabolic disorders, such as obesity, at an early age and to implement preventive health services for these children (Byrd-Williams et al., 2008;McMurray, Bangdiwala, Harrell, & Amorim, 2008;Ruiz et al., 2009;Twisk, Kemper, & Van Mechelen, 2002). Many governments use field measurements to monitor students' physical fitness and basic motor competencies (Bianco et al., 2015;Ruiz et al., 2011). Today's schools and physical education teachers play an important role in assessing the physical fitness of students (Ruiz et al., 2011;Tabacchi, 2011). Although there are many tests designed to assess a single aspect of physical fitness, it is crucial to combine tests that will provide a comprehensive measurement of students' physical fitness.
Although there are valid and reliable research-based tools and methods to measure students' levels of physical fitness, it is not usually possible to duplicate laboratory conditions in school environments (Ruiz et al., 2011). Therefore, because they are time and cost efficient, have low equipment requirements, and are easy to apply to a large group of people simultaneously, field-based testing approaches can present reasonable options to measure the physical-fitness levels of different student populations (Bianco et al., 2015;Ruiz et al., 2011). For example, the ALPHA (Assessing Levels of Physical Activity) test battery was designed by Ruiz et al. (2011) to evaluate the physical-fitness levels of children and adolescents. The ALPHA test battery was created with reference to 15 different fitness test batteries and numerous cross-sectional studies and aims to present comparable methods for international use. This test battery uses the 20-m shuttle run test as the indicator of cardiovascular fitness, the handgrip strength and standing broad jump tests as the indicators of musculoskeletal fitness, and the 4 × 10-m shuttle run test as the indicator of motor fitness (Ruiz et al., 2011).
The 20-m shuttle run test was preferred in numerous experimental studies and other major field-based test batteries, such as EUROFIT (Council of Europe Committee for the Development of Sport [CECDS], 1988) and ASSO-fitness (Bianco et al., 2015;Tabacchi, 2011) because it has been considered as a valid test for estimating cardiorespiratory fitness (Colombo-Dougovito, 2013;Leger, Mercier, Gadoury, & Lambert, 1988;Mayorga-Vega, Aguilar-Soto, & Viciana, 2015). The handgrip strength and the standing broad jump tests are reliable assessments of musculoskeletal fitness and have been used in scientific studies for more than 50 years (Barfield, Channell, Pugh, Tuck, & Pendel, 2012;Everett & Sills, 1952;Kane & Meredith, 1952;Newman et al., 1984;Wind, Takken, Helders, & Engelbert, 2010). The 4 × 10-m shuttle run test, which demonstrates speed and agility, is a proven positive indicator of bone mass in young people, and many investigations prefer it as a simple and useful field-based fitness test (Ortega et al., 2011;Ortega, Ruiz, Castillo, & Sjöström, 2008). The ALPHA test battery, therefore, includes tests that are suitable for physical education teachers to use for measuring both physical fitness status of all students and the athletic competence of student athletes.
The hypothesis of this study is that these tests can reveal differences in physical characteristics between student athletes and non-athletes. If this hypothesis is confirmed, this study could provide a new argument for the use of these tests in measuring the athletic competence of student athletes. Therefore, this study aims to determine the differences in fitness levels between student athletes and non-athletes and to assess the effectiveness of the ALPHA test battery in distinguishing between the groups.

Participants
Sixty-eight healthy male students (34 athletes and 34 non-athletes) volunteered to participate in this study. The main inclusion criterion for the non-athlete group was to have a body mass index similar to that of one of the athletes. The other inclusion criteria for the non-athletes were not using medication and not having a current disability. The inclusion criteria for athletes were taking part in school sports such as football, basketball, volleyball and table tennis for at least two years and training at least two days a week. All participants were recruited from a high school. A comprehensive verbal explanation of the procedure, the purpose of the study, and the possible risks was given to the participants and their parents, and written informed consent was obtained.

Procedures
Each participant completed three test sessions (cardiorespiratory, musculoskeletal, and motor fitness) separated by 48-hour intervals. The reliability and validity of all tests used in this study have been demonstrated in different studies conducted with many child and adolescent populations (Mayorga-Vega et al., 2015;Ruiz et al., 2009;Ruiz et al., 2011;Ruiz et al., 2008;Wind et al., 2010). The athlete group was tested in random order, whereas each non-athlete followed the order of his partner in the other group (the athlete with a body mass index similar to his own). Anthropometric measurements were taken in the familiarization session. All measurements were administered by the first researcher in our school's indoor sports facility. The tests were performed between 09:00 and 11:00. The participants were verbally encouraged to maximize their performance during the tests. In addition, they were not allowed to use any supplements during this period or to perform any exhaustive activity in the 24 hours before the testing days.
The height of the participants was measured with a portable stadiometer (Holtain Ltd, Crosswell, Crymych, Dyfed, UK) according to standard procedures. Body weight was measured with participants barefoot and wearing shorts and a t-shirt, using a Tanita weighing scale (BC-310; Tanita Corp., Tokyo, Japan). Body mass index was calculated as body mass divided by height squared (kilograms per square meter, kg·m⁻²). Waist circumference was measured using a flexible (but non-elastic) tape measure, over a single layer of clothing (0.5 cm was deducted), approximately midway between the top of the iliac crest and the lower border of the bottom rib.
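The two anthropometric calculations described above can be sketched as follows. This is a minimal illustration rather than the authors' own procedure: the function names and the example participant are hypothetical, and only the formula (body mass divided by height squared) and the 0.5 cm clothing deduction come from the text.

```python
def body_mass_index(mass_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2 (body mass divided by height squared)."""
    if mass_kg <= 0 or height_m <= 0:
        raise ValueError("mass and height must be positive")
    return mass_kg / height_m ** 2


def corrected_waist_cm(measured_cm: float, clothing_allowance_cm: float = 0.5) -> float:
    """Subtract the clothing allowance from a waist measurement taken over one layer."""
    return measured_cm - clothing_allowance_cm


# Hypothetical participant (not data from the study): 65 kg, 1.75 m,
# waist measured at 74.0 cm over a t-shirt.
bmi = body_mass_index(65.0, 1.75)
waist = corrected_waist_cm(74.0)
print(f"BMI = {bmi:.1f} kg/m^2, waist = {waist:.1f} cm")
```

Note that height must be entered in meters; a stadiometer reading in centimeters would first be divided by 100.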
The 20-m progressive shuttle run test was performed to measure the cardiorespiratory fitness of participants. Participants were instructed to run between two lines 20 m apart while keeping pace with audio signals played from a smartphone through an amplified sound system. The participants were introduced to the test and given the opportunity to practice for about ten minutes in the familiarization session. The initial speed was 5.0 km·h⁻¹, which was increased by 0.75 km·h⁻¹ per minute (1 min equals one stage). Participants were instructed to run in a straight line, to pivot on completing a shuttle, and to pace themselves in accordance with the signals. The test ended when a participant failed to reach the end lines in time with the audio signals on two consecutive occasions or voluntarily stopped running due to fatigue. All measurements were carried out under standardized conditions in an indoor sports facility. The participants were verbally encouraged to keep running as long as possible throughout the course of the test. The total number of laps completed by each participant was recorded.
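The pacing and stopping rules above can be illustrated with a short sketch. It uses only the speed values stated in this section (5.0 km/h initial speed, 0.75 km/h increase per one-minute stage) and the two-consecutive-misses rule. The function names are hypothetical, and the convention of counting only shuttles reached in time is an assumption, since the text does not say how a missed shuttle is scored.

```python
def stage_speed_kmh(stage: int, initial_kmh: float = 5.0, increment_kmh: float = 0.75) -> float:
    """Running speed (km/h) required during a given one-minute stage (1-based)."""
    if stage < 1:
        raise ValueError("stage numbering starts at 1")
    return initial_kmh + increment_kmh * (stage - 1)


def seconds_per_shuttle(stage: int, distance_m: float = 20.0) -> float:
    """Time allowed to cover one 20-m shuttle at the speed of the given stage."""
    speed_m_per_s = stage_speed_kmh(stage) / 3.6  # km/h -> m/s
    return distance_m / speed_m_per_s


def laps_completed(reached_in_time: list[bool]) -> int:
    """Count shuttles reached in time until two consecutive misses end the test.

    Assumption: a missed shuttle is not counted as a completed lap.
    """
    laps = 0
    consecutive_misses = 0
    for reached in reached_in_time:
        if reached:
            laps += 1
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses == 2:
                break
    return laps


for stage in (1, 5):
    print(stage, stage_speed_kmh(stage), round(seconds_per_shuttle(stage), 1))
```

A voluntary stop due to fatigue can be represented simply by ending the input list at the last attempted shuttle.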
Standing broad jump (SBJ) and handgrip strength (HGS) tests were performed to measure the musculoskeletal fitness of participants. For the SBJ test, the participants were introduced to the striped area and given the opportunity to practice 3-5 times in the familiarization session. The participants stood behind the starting line and were instructed to push off vigorously and jump as far as possible. The participants had to land with their feet together and stay upright. Jump distance was measured from the takeoff line to the point where the back of the heel nearest to the takeoff line landed on the mat. Two trials were performed and the higher score was accepted as valid.
HGS was measured with a hand dynamometer (TAKEI 5001, Scientific Instruments, Tokyo, Japan) in a seated position with the shoulder adducted and flexed 70°. The participants were introduced to the dynamometer and given the opportunity to practice 3-5 times in the familiarization session. The dominant hand was used for the handgrip strength measurement, and participants were instructed to stand upright with their feet at hip width and to look forward with their elbow fully extended. For all handgrip strength tests, two trials were performed, and the highest result was accepted as valid.
The 4 × 10-m shuttle run test was performed to measure the motor fitness of participants. Participants ran in groups of 5-6 people. During the test, video was recorded with a smartphone capable of recording slow-motion video. This made it possible to verify whether each participant crossed the lines and to measure test duration precisely. Two parallel lines were drawn on the floor 10 m apart. The participants were instructed to run as fast as possible from the starting line to the other line and to return to the starting line, crossing the line with both feet every time. The participants were introduced to the test and given the opportunity to practice in the familiarization session. Table 1 shows the design and timeline of the study.

The tests were performed in random order. In the first session, anthropometric measurements were performed before testing. Anthropometric measurements of all participants were completed in approximately 1.5 hours.

Data Analysis
Statistical analyses were performed using IBM SPSS Statistics for Windows, Version 21.0 (Armonk, NY), and p < 0.05 was accepted as significant. The normality of the data was checked using the Shapiro-Wilk test. Data were analyzed using descriptive statistics, and the results are presented as mean ± standard deviation (SD). The differences between the athletes and non-athletes were determined using an independent-samples t-test. Additionally, effect sizes were calculated using Cohen's d (Cohen, 1988) and classified according to Hopkins (Hopkins, 2015). Further, a discriminant function analysis was used to determine whether the cardiorespiratory-, musculoskeletal-, and motor-fitness tests could discriminate between athletes and non-athletes. The homogeneity of the covariance matrices was checked using Box's M test. Collinearity was analyzed to identify correlations between the independent variables. The total lap repetitions variable of the 20-m shuttle run test was excluded from the discriminant function analysis model because it was highly correlated ( ) with VO2max. Structure coefficients were used to determine the variables that discriminate between athletes and non-athletes. A structure coefficient above 0.30 was accepted as relevant for the interpretation of the linear vectors.
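The group comparison described above can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure the authors ran: it uses the pooled-standard-deviation form of Cohen's d and the equal-variance t statistic, the function names are hypothetical, the example scores are invented, and the cut-offs in `hopkins_label` (0.2, 0.6, 1.2, 2.0, 4.0) are the commonly cited Hopkins thresholds, which this section does not list explicitly.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_sd(a: list[float], b: list[float]) -> float:
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))


def cohens_d(a: list[float], b: list[float]) -> float:
    """Standardized mean difference (mean of a minus mean of b)."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def independent_t(a: list[float], b: list[float]) -> float:
    """Equal-variance independent-samples t statistic (df = na + nb - 2)."""
    standard_error = pooled_sd(a, b) * sqrt(1 / len(a) + 1 / len(b))
    return (mean(a) - mean(b)) / standard_error


def hopkins_label(d: float) -> str:
    """Qualitative label for an effect size; thresholds are an assumption."""
    size = abs(d)
    for upper, label in [(0.2, "trivial"), (0.6, "small"), (1.2, "moderate"),
                         (2.0, "large"), (4.0, "very large")]:
        if size < upper:
            return label
    return "extremely large"


# Invented standing broad jump scores in cm (not data from the study).
athletes = [210.0, 220.0, 230.0, 240.0]
non_athletes = [190.0, 200.0, 210.0, 220.0]
d = cohens_d(athletes, non_athletes)
print(round(d, 2), hopkins_label(d), round(independent_t(athletes, non_athletes), 2))
```

The p value for the t statistic would still need a t distribution with na + nb - 2 degrees of freedom, which is left out here to keep the sketch within the standard library.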

Results
Means and standard deviations of descriptive characteristics and fitness test values for both groups are presented in Table 2. No significant differences were detected between the athlete and non-athlete groups for age, height, weight, and body mass index. However, athletes had significantly higher values in 20m CF test laps, VO2max, SBJ, HGS, and 4x10m MF test compared with the non-athletes.
The independent variables selected for inclusion in the discriminant function analysis exhibited low shared variances, as shown by their zero-order correlation coefficients (see Table 3). The only exception to this was the shared variance for 20m CF test laps and VO2max ( ). Hence, 20m CF test laps were excluded from the discriminant function analysis. The original classification summary shows that 70.6% of the cases were correctly classified in their status (Table 4). This is significant when compared to the proportional chance ( ).
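To make the comparison with proportional chance concrete, the sketch below computes the proportional chance criterion in its usual textbook form (the sum of squared group proportions) for the two groups of 34. This is an assumption about which criterion was meant: the value reported by the authors is missing from this text and is not reconstructed here. The count of 48 correct cases is inferred from 70.6% of 68 participants, and the function names are hypothetical.

```python
def proportional_chance(group_sizes: list[int]) -> float:
    """Proportional chance criterion: sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)


def hit_ratio(correct: int, total: int) -> float:
    """Share of cases assigned to their actual group."""
    return correct / total


group_sizes = [34, 34]            # athletes, non-athletes (from the text)
total = sum(group_sizes)
correct = round(0.706 * total)    # inferred from the reported 70.6%

chance = proportional_chance(group_sizes)
hit = hit_ratio(correct, total)
print(f"chance = {chance:.3f}, hit ratio = {hit:.3f}, correct cases = {correct}")
```

With equal group sizes the criterion reduces to 0.5, so any hit ratio is judged against the accuracy expected from assigning cases at random in proportion to group size.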

Discussion and Conclusion
Currently, a societal shift towards lifestyle changes that would improve physical health in adulthood appears unlikely. Therefore, a regular screening of the fitness status of children and the adolescent population as a preventative measure against metabolic disorders and their health consequences is a priority for public-health initiatives (Bianco et al., 2015;Gençoğlu & Akkuş, 2020;WHO, 2010). Physical fitness is positively associated with a healthy cardiovascular profile during childhood and adolescence and negatively associated with a clustered metabolic risk during childhood (Cohen et al., 2014;Twisk et al., 2002). The findings of longitudinal studies have supported the idea that physical fitness has a positive effect on the overall wellbeing of young people, confirming the importance of evaluating physical fitness from an early age (Byrd-Williams et al., 2008;Ruiz et al., 2009). Therefore, physical fitness has emerged as an important marker of both current and future health (Ortega et al., 2008;Ruiz et al., 2009;Ruiz et al., 2011). A growing number of students are participating in the fitness-improving activities provided by physical education classes and after-school programs. Additionally, many countries aim to improve the physical-fitness levels of the young population through projects in which schools and physical education teachers play a main role (Bernhardt et al., 2001;Faigenbaum, Milliken, & Westcott, 2003;Tabacchi, 2011). Because they are time and cost efficient, have low equipment requirements, and are easy to apply to a large group of people simultaneously, field-based testing approaches can present reasonable options to measure the physical-fitness levels of different student populations (Bianco et al., 2015;Ortega et al., 2008;Ruiz et al., 2011). 
Considering the mechanical, physical, and physiological skills measured by field-based fitness tests, the hypothesis of this study was that these tests can also differentiate between student athletes and non-athletes. This method, then, can be used to monitor the performance of school athletes and to identify talented students for school teams.
In a different study that used similar tests with adolescent handball players, the discriminant analysis predicted successful and less successful players with an overall accuracy of 73.6% (Palamas et al., 2015). Additionally, Özbay and Ulupınar (2020) found that laboratory tests performed with classical methods can distinguish elite athletes from others with an overall accuracy of 65.4%. Another study showed a correlation of 0.68 (r² = 0.46) between musculoskeletal fitness and body composition and emphasized the importance of this finding (Nikolaidis, 2014). In addition, many studies have reported that body mass index is one of the major indicators for evaluating the physical fitness of adolescents (Hasan, 2013;Nevill et al., 2010;Nikolaidis, 2013;Ortega et al., 2011;Tabacchi, 2011). Because of this, even though the body mass index was controlled in this study, the ability of these tests to correctly classify 70.6% of the participants should be considered an important finding. The demanding environmental, time, and cost requirements of most methods for measuring fitness levels make the work of physical education teachers challenging. This makes the availability of valid, reliable, and reasonable tests particularly useful in educational settings. This study provides new evidence that physical education teachers can use the ALPHA test battery as a series of field-based tests to monitor the athletic competence of students involved in school sports.
Many existing studies provide strong evidence for the validity and reliability of the 20-m shuttle run test as a test to assess cardiorespiratory fitness (Bianco et al., 2015;Lee et al., 2019;Leger & Lambert, 1982;Ruiz et al., 2011). The 20-m shuttle run test is also used to estimate maximum oxygen consumption (VO2max), which is considered the main indicator of endurance for athletes. VO2max is the gold standard for evaluating cardiorespiratory fitness. Cardiorespiratory fitness provides information about the overall capacity of the cardiovascular and respiratory systems and an athlete's ability to perform long-term exercises (Leger et al., 1988;Ruiz et al., 2008). The handgrip strength and standing broad jump tests have been used in scientific studies for more than 50 years as valid and reliable methods for measuring musculoskeletal fitness (Everett & Sills, 1952;Ince & Ulupinar, 2020;Kane & Meredith, 1952;Wind et al., 2010). These tests have also been used to determine isometric strength, or muscle strength, in many different sections of the population (including amateur-professional, male-female, children, youth and adults) to identify their strength-power profile (Fukuda et al., 2013;Özbay & Ulupınar, 2020;Peterson, Alvar, & Rhea, 2006;Wind et al., 2010). Similarly, according to previous studies, the 4 × 10-m shuttle run test is a reliable test for measuring motor fitness (Calatayud et al., 2017;Ortega et al., 2008). Keeping the characteristics measured by these tests in mind, this study establishes that the ALPHA test battery can be used to assess the fitness levels of the overall student population and the athletic performance of the students who participate in school sports.
Consequently, the ALPHA test battery effectively differentiates between student athletes and non-athletes, and the scores obtained from these tests differ between the groups. This study suggests that the ALPHA test battery could effectively measure the fundamental performance characteristics of school athletes, independently of body mass index. Additionally, since one investigator was able to complete measurements of 68 participants in 9-10 hours, this study provides experimental results that demonstrate the easy application of these tests to large groups. Thus, this study indicates that physical education teachers can effectively use the ALPHA test battery to monitor the performance of school athletes and to identify talented athletes in a school setting.
The practical strengths of this study are as follows: physical education teachers could use the ALPHA test battery to measure the athletic competencies of student athletes, and these tests use efficient and easy-to-apply methodologies. In addition, this study presents more accurate results because the body mass index was controlled. The limitations of this study are as follows: all of the participants were male, schools in different regions were not included, and the sample size was small. In future studies, similar test batteries can be examined, larger sample sizes can be used, students from various schools can be included, and the differences in the results of these tests between different age and gender groups can be evaluated.