Formal assessment is a way to gauge the amount of knowledge students have retained. Since all students take the exact same form of the test, the teacher can make comparisons among students.
For teachers, formal assessment results in a score that is recorded in the student’s file and used as part of their overall grade for the course (in other words, it’s often also a summative assessment).
For schools or school districts, standardized tests are administered to all students to be used for comparison purposes. School administrators can determine how their students are doing compared to other schools and districts. This allows them to identify their strengths and areas that need improvement.
Meet the Peer Reviewer: The review process on Helpful Professor involves having a PhD-level expert fact check, edit, and contribute to articles. This article was written by Dr Dave Cornell and peer reviewed by Dr Chris Drew.
Definition of Formal Assessment
A formal assessment can be defined in multiple ways. Generally, we define it in contrast to informal assessment.
For example, Barker (2004, p. 9) defines formal and informal assessment like this:
“In a formal assessment some kind of structure is emphasized. Usually, this has been planned and studied carefully, usually through research.”
“In an informal assessment the information is collected by less structured, perhaps even haphazard, methods.”
Peer Reviewer’s Note: Here, we can see that a central characteristic of formal assessment is that it is planned and studied. Furthermore, as noted in Dr. Cornell’s introduction above, we often also consider it to be standardized (although not always) and leading to a formal grade (as opposed to just for gathering formative feedback – see our article on formative assessment for more).
Formal Assessment Examples
- End-of-term university exam: At the end of every term, the history teacher administers a 50-item multiple choice and True/False exam.
- Standardized tests: In order to receive federal funding, a school district administers a standardized achievement test to all students at the end of the academic year.
- University admissions testing: Universities require all applicants to take the SAT or ACT so the admissions committee can determine how their level of preparedness compares to that of other applicants.
- Teacher accreditation tests: Before being granted a teaching license, most U.S. states require aspiring teachers to take the Praxis Series exams.
- Law school admissions tests: Law schools in the U.S. require all applicants to take the LSAT as part of the application process.
- Graduate Record Exam: The Graduate Record Exam (GRE) is designed to measure a student’s ability to do well in graduate school, especially in the social sciences.
- Cumulative Testing: Students in a nutrition course have to analyze the nutritional profile of the school’s lunch at the end of every month as a component of their overall grade.
- IELTS: The IELTS test is a formal English language test that results in scores for listening, reading, writing, and speaking that are used by governments like Canada in immigration assessments.
- Health Inspections: If a health inspector comes to a restaurant and checks it for cleanliness, the assessment will affect the outcome of the restaurant’s accreditation. Therefore, this is a formal assessment.
- Final Quizzes: A math teacher gives her students a quick quiz every Monday. Scores will be added up and reported to parents at the end of the term.
Approaches to Formal Assessment
1. Computer Adaptive Tests (CAT)
The use of paper-and-pencil tests is gradually disappearing. Because of concerns about paper waste and the ability to score exams rapidly and automatically, testing via computers and the internet is becoming mainstream.
A computer adaptive test (CAT) adds an additional feature. The computer will adjust the difficulty of questions based on the test-taker’s performance on each item.
The answer to each question determines the level of difficulty of the subsequent item. If the test-taker answers correctly, then the next item is slightly more difficult. If the test-taker answers incorrectly, then the next item will be at either an equal or a lower level of difficulty.
This kind of real-time adjustment makes the test more efficient without sacrificing precision:
“Over the course of several decades, research has repeatedly demonstrated that CAT is more efficient than paper-and-pencil tests, with equal or better measurement precision” (Seo, 2017, p. 8).
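The adjustment rule described above can be sketched in a few lines of Python. This is only an illustration: the 1 to 10 difficulty scale, the starting level, and the function names are assumptions made for the example, and the sketch always steps down after a wrong answer rather than sometimes repeating the same level. Operational CATs typically select items with statistical models rather than a fixed step rule.

```python
def next_difficulty(current: int, answered_correctly: bool,
                    lowest: int = 1, highest: int = 10) -> int:
    """Return the difficulty level of the next item (hypothetical 1-10 scale)."""
    if answered_correctly:
        # Correct answer: the next item is slightly harder, capped at the top level.
        return min(current + 1, highest)
    # Incorrect answer: the next item is easier, floored at the bottom level.
    return max(current - 1, lowest)


def run_adaptive_test(responses: list[bool], start: int = 5) -> list[int]:
    """Return the difficulty level presented for each response, in order."""
    levels = []
    level = start
    for correct in responses:
        levels.append(level)
        level = next_difficulty(level, correct)
    return levels


# Two correct answers, one miss, then another correct answer.
print(run_adaptive_test([True, True, False, True]))  # [5, 6, 7, 6]
```

Because each item is pitched near the test-taker's current level, fewer items are wasted on questions that are far too easy or far too hard.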
2. Multi-Method Assessment
Some students are very good at taking certain types of tests. For example, students with high verbal skills are good at writing short answers and essays. However, other students may be kinesthetic learners, so they will be better at performing skills or demonstrating their understanding through action.
Therefore, teachers should implement formal assessment procedures that utilize a variety of testing methods. A student’s final score in a course should be composed of their scores on several types of tests.
The final score could come from performance-based assessment, such as oral presentations or designing infographics. Project-based learning could be demonstrated by working in a team to construct a 3D object or produce a poster.
When there are multiple methods of formal assessment, it gives each student an opportunity to do well according to their unique characteristics.
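One simple way to combine several assessment types into a final score is a weighted average. The sketch below assumes three hypothetical components and weights chosen purely for illustration; the article does not prescribe any particular weighting.

```python
def composite_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of component scores; weights need not sum to 100."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same components")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical student who is stronger in performance tasks than written exams.
scores = {"written_exam": 70, "oral_presentation": 90, "team_project": 80}
weights = {"written_exam": 40, "oral_presentation": 30, "team_project": 30}

print(composite_score(scores, weights))  # 79.0
```

Here the student's final score (79) is higher than the written exam alone (70), because the other methods let them show what they know in a different way.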
3. The Redesigned SAT
The College Board, which designs and administers the SAT, redesigned the test in 2016. The test was redesigned to better reflect the content taught in secondary schools and the core knowledge and skills that are necessary for success at the college level.
In order for the test to be useful, however, it must possess predictive validity. That means that scores on the test, taken before entering college, should be highly correlated with student GPAs in the first year of college (FYGPA).
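Predictive validity is commonly summarized with a correlation coefficient. The sketch below computes Pearson's r between test scores and first-year GPAs. The five data points are invented for illustration and are not taken from any study; the function name is also an assumption.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data: five students' test scores and first-year GPAs.
sat_scores = [1000, 1100, 1200, 1300, 1400]
fygpa = [2.6, 2.9, 3.1, 3.4, 3.7]

r = pearson_r(sat_scores, fygpa)
print(round(r, 2))
```

A value of r close to 1 means that students with higher test scores tend to earn higher first-year grades, while a value near 0 means the test tells us little about later performance.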
A study by Westrick et al. (2019) from the College Board included 171 colleges and a total sample of over 200,000 students. Demographic information indicated that the sample was heterogeneous and relatively representative of the general population.
According to the researchers, “Findings from the current study affirm the value and effectiveness of the SAT as a tool for institutions to use to inform decisions related to admission…” (p. 20).
More specifically, “SAT scores are strongly predictive of college performance—students with higher SAT scores are more likely to have higher grades in college” (p. 20).
Peer Reviewer’s Note: Evidently, the SAT test is a prime example of a formal assessment. This example of the redesigned SAT also fits well into the definition from Barker (2004) provided earlier, which highlights the role of “planning” and “studying” the tests to ensure they’re well-designed and measure what they intend to measure.
4. Norm-Referenced Tests
Knowing one’s score on a test is most informative if it can be compared to others. A score in isolation reveals little information about the test-taker’s level of knowledge or skills.
A norm-referenced test is a standardized test that has been given to a large sample of individuals. A score on the test can then be compared to the performance of others that took the same test.
For example, receiving a score of 83% may sound mediocre. However, if the highest score in the population was 85%, then all of a sudden, the 83 looks impressive. Perhaps the test was exceptionally difficult and so no one could achieve a score in the 90s.
To provide more exact comparison information, scores on standardized tests are usually reported in terms of percentiles. In addition to revealing the test-taker’s absolute score, the report expresses performance as the percentage of people whose scores the test-taker equaled or exceeded.
For example, a test-taker who scored at the 96th percentile scored as well as or better than 96% of test-takers.
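The "as well as or better than" definition of a percentile can be sketched as follows. The norm group of ten scores is hypothetical and far smaller than a real norming sample; it is chosen so that 85% is the highest score, as in the example above.

```python
def percentile_rank(score: float, norm_group: list[float]) -> float:
    """Percentage of norm-group scores that `score` equals or exceeds."""
    if not norm_group:
        raise ValueError("norm_group must not be empty")
    at_or_below = sum(1 for other in norm_group if other <= score)
    return 100 * at_or_below / len(norm_group)


# Hypothetical norm group for a difficult test (highest score is 85).
norm_group = [52, 58, 61, 64, 67, 70, 72, 75, 79, 85]

print(percentile_rank(83, norm_group))  # 90.0
```

A raw score of 83% lands at the 90th percentile of this group, which shows why the same score can look mediocre in isolation and strong once it is compared with others.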
5. Criterion-Referenced Tests
Sometimes teachers or certification agencies use formal assessment to determine if test-takers have acquired a certain level of knowledge. Scores are not compared to others, but are instead compared to a set of standards.
The criteria are already established and defined in a very specific and precise manner. They represent what test-takers are expected to know or be able to perform.
Therefore, if a test-taker achieves a certain cut-off score, then they pass. This means the test-taker can move on to the next stage of academic study or receive a certificate or license.
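The contrast with norm-referenced scoring can be made concrete with a short sketch. The cut-off score of 70 and the student names are assumptions for illustration only.

```python
CUT_SCORE = 70  # hypothetical passing standard, fixed before the test is given


def passes(score: float, cut_score: float = CUT_SCORE) -> bool:
    """A test-taker passes by meeting the standard, regardless of how others did."""
    return score >= cut_score


# Hypothetical test-takers and their scores.
scores = {"Ana": 82, "Ben": 69, "Chloe": 70}
results = {name: passes(score) for name, score in scores.items()}

print(results)  # {'Ana': True, 'Ben': False, 'Chloe': True}
```

Note that each decision depends only on the individual score and the standard. Everyone in the group could pass, or no one could.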
Criterion-referenced tests are also used in proficiency-based learning, which refers to a philosophy of education based on students demonstrating mastery of knowledge or skills.
Test-takers that reach a certain level of proficiency in the designated domain are then allowed to progress academically. If, however, the test-taker fails to demonstrate mastery, they are given additional support until they can achieve the expected level of performance.
Conclusion
Formal assessment is a structured way to determine how much a student has learned. In the classroom, it can contribute to a student’s grade in a course, which may be made up of paper-and-pencil exams, project-based learning assignments, or performance-based demonstrations of learning (such as in the case of authentic assessment).
School districts use standardized tests to understand how their districts compare to others in the state or nation. This allows them to understand what they are good at and areas to target for improvement.
Universities use formal assessments such as the SAT and GRE to gauge the likelihood that an applicant is prepared for more advanced academic study.
References
Barker, P. J. (2004). Assessment in psychiatric and mental health nursing: in search of the whole person. Los Angeles: Nelson Thornes.
Brookhart, S. M. (2004). Assessment theory for college classrooms. New Directions for Teaching and Learning, 100, 5-14. https://doi.org/10.1002/tl.165
Seo, D. G. (2017). Overview and current management of computerized adaptive testing in licensing/certification examinations. Journal of Educational Evaluation for Health Professions, 14. https://doi.org/10.3352/jeehp.2017.14.17
Kane, M. T. (2006). Validation. In R. Brennan (Ed.), Educational Measurement, 4th Edition (pp. 17-64). Washington, DC: American Council on Education.
Westrick, P., Marini, J., Young, L., Ng, H., Shmueli, D., & Shaw, E. (2019). Validity of the SAT® for Predicting First-Year Grades and Retention to the Second Year.
Dave Cornell (PhD)
Dr. Cornell has worked in education for more than 20 years. His work has involved designing teacher certification for Trinity College in London and in-service training for state governments in the United States. He has trained kindergarten teachers in 8 countries and helped businessmen and women open baby centers and kindergartens in 3 countries.
Chris Drew (PhD)
This article was peer-reviewed and edited by Chris Drew (PhD). The review process on Helpful Professor involves having a PhD level expert fact check, edit, and contribute to articles. Reviewers ensure all content reflects expert academic consensus and is backed up with reference to academic studies. Dr. Drew has published over 20 academic articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education and holds a PhD in Education from ACU.