Evaluation (Chapter 7)


7.1. Evaluating the impact of the research

7.2. Evaluating the artefact

7.2.1. Heuristic Evaluation

7.2.2. Evaluating the Artefact: Sample End Users

7.3. Evaluating the project

 

Evaluation is the final but essential part of software development. In commercial projects it demonstrates the completeness of the work and justifies the money paid for it. In non-commercial projects, it captures the knowledge developed and serves as testimony that the project is complete.

 

Following the subject of the project, the evaluation shows which sections resulted in development and how they were implemented (Prat, Comyn-Wattiau, & Akoka, 2015). In this chapter the primary research is evaluated first. Then the final product is evaluated from the viewpoints of the Heuristic Evaluation. The application was also evaluated by the sponsor and by a sample user group. The chapter ends with an evaluation of the whole project.

7.1. Evaluating the impact of the research

At the beginning of the project, the purpose of the research was to introduce the environment of the problems and to start working towards fixing them. The research covered the basics, well-known IT solutions, and user behaviour.

 

After the research it seemed that enough information had been gained, but new problems were encountered throughout development and testing, and these required further research. The reason for the frequent returns to research was the connection of outdated sensors to up-to-date mobile devices and the incompatibility between them. Continuous updates to the operating system of the mobile device also made reaching the final goal much more difficult, as did the inadequate knowledge available about the sensors. One of the lessons learned was that usability is very limited on cheap external sensors.

 

The research reminded me of one of the eternal truths: the more somebody knows, the quicker they realise how little they actually know.

7.2. Evaluating the artefact

7.2.1. Heuristic Evaluation

When evaluating software, one of the most commonly accepted methods is Heuristic Evaluation (Hermawati & Lawson, 2015). This method examines a software product's usability from 10 different viewpoints. The benefits of the method are that it is cheap, quick, and easy to carry out, so it was chosen for this evaluation.

7.2.1.1. Visibility of system status

An important aspect is that the user is aware of the current status of the system at all times. To achieve this in the application, ProgressDialog boxes are displayed during time-consuming processes. In the case of the counters and timers, a circular progress bar visually shows the elapsed and remaining time. Completed and remaining tests are indicated by brighter and darker grey question numbers at the top of the screen.
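As an illustration, the sketch below shows one way a ProgressDialog can be wrapped around a time-consuming background task on Android. It is a minimal example only: the ConnectTask class and the connectToSensor() method are illustrative names, not taken from the project source.

```java
// Minimal sketch: show a ProgressDialog while a slow task runs in the background.
import android.app.Activity;
import android.app.ProgressDialog;
import android.os.AsyncTask;

public class ConnectTask extends AsyncTask<Void, Void, Boolean> {
    private final Activity activity;
    private ProgressDialog dialog;

    public ConnectTask(Activity activity) {
        this.activity = activity;
    }

    @Override
    protected void onPreExecute() {
        // Make the system status visible before the slow work starts.
        dialog = ProgressDialog.show(activity, "Please wait",
                "Connecting to the sensor...", true);
    }

    @Override
    protected Boolean doInBackground(Void... params) {
        return connectToSensor(); // placeholder for the real connection logic
    }

    @Override
    protected void onPostExecute(Boolean success) {
        dialog.dismiss(); // the status change is shown to the user immediately
    }

    private boolean connectToSensor() {
        return true; // illustrative stub
    }
}
```

Such a task would be started from an Activity with new ConnectTask(this).execute();.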

7.2.1.2. Match between system and real world

To create the connection with the real world, the tasks in the tests were built on methods that have been used in practice for a long time, for example the TOPS mental questionnaire or the Flamingo Balance Test from the Eurofit fitness test battery. Furthermore, ease of use was an important aspect, so the tests provide short, easy-to-understand instructions to support the completion of each test.

7.2.1.3. User control and freedom

The application handles user interaction with easy-to-understand buttons. There is always an opportunity to stop any of the tests and to go back to the starting screen. Rarely used functions, such as signing out, reading the documentation, and changing user data, are available at any time through the Action Bar menu.
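A minimal sketch of routing such rarely used functions through the Action Bar menu is given below; the menu resource and item ids (R.menu.main_menu, action_sign_out, and so on) as well as the stub methods are assumptions for illustration only.

```java
// Minimal sketch: rarely used functions reached through the Action Bar menu.
import android.app.Activity;
import android.view.Menu;
import android.view.MenuItem;

public class MainActivity extends Activity {

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the rarely used functions into the Action Bar menu.
        getMenuInflater().inflate(R.menu.main_menu, menu);
        return true;
    }

    @Override
    public boolean onOptionsItemSelected(MenuItem item) {
        switch (item.getItemId()) {
            case R.id.action_sign_out:
                signOut();           // sign the user out
                return true;
            case R.id.action_manual:
                showDocumentation(); // open the built-in documentation
                return true;
            case R.id.action_profile:
                editUserData();      // change the stored user data
                return true;
            default:
                return super.onOptionsItemSelected(item);
        }
    }

    private void signOut() { /* illustrative stub */ }
    private void showDocumentation() { /* illustrative stub */ }
    private void editUserData() { /* illustrative stub */ }
}
```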

7.2.1.4. Consistency and standards

Ease of use and learnability are achieved by ensuring that the test system is consistent. Before every task, the user is shown a home screen, which gives information on the tasks involved in the next test. Here the user can read a longer description of the task, start the test, or abandon it. Tasks that require preparation give a 5-second countdown when started, and this countdown looks the same for every task.
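Android's CountDownTimer makes such a uniform countdown simple to implement. The helper below is a minimal, illustrative sketch; the class and its parameters are assumptions, not the project's actual code.

```java
// Minimal sketch: the same 5-second countdown before any test that needs preparation.
import android.os.CountDownTimer;
import android.widget.TextView;

public class PreTestCountdown {

    public static void start(final TextView display, final Runnable startTest) {
        new CountDownTimer(5000, 1000) {
            @Override
            public void onTick(long millisUntilFinished) {
                // Round to the nearest whole second: 5, 4, 3, 2, 1.
                display.setText(String.valueOf(Math.round(millisUntilFinished / 1000.0)));
            }

            @Override
            public void onFinish() {
                startTest.run(); // every test then begins in the same way
            }
        }.start();
    }
}
```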

7.2.1.5. Error prevention

The entry of data in an invalid format is checked, and exact information about the error is given by the Validation class. For example, when entering an e-mail address, the system checks after each typed character whether the given information is in the right format, and the information is only recorded once the correct format is achieved. At critical points in the program, try/catch blocks and if statements prevent the application from crashing.
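The per-character check can be implemented with a TextWatcher and Android's built-in e-mail pattern. The class below is an illustrative stand-in for the project's Validation class rather than a copy of it.

```java
// Minimal sketch: validate the e-mail format after every typed character.
import android.text.Editable;
import android.text.TextWatcher;
import android.util.Patterns;
import android.widget.EditText;

public class EmailValidator implements TextWatcher {
    private final EditText emailField;

    public EmailValidator(EditText emailField) {
        this.emailField = emailField;
    }

    @Override
    public void afterTextChanged(Editable s) {
        if (!Patterns.EMAIL_ADDRESS.matcher(s).matches()) {
            // Give exact information about the error next to the field.
            emailField.setError("Invalid e-mail format");
        } else {
            emailField.setError(null); // valid: the value may now be recorded
        }
    }

    @Override
    public void beforeTextChanged(CharSequence s, int start, int count, int after) { }

    @Override
    public void onTextChanged(CharSequence s, int start, int before, int count) { }
}
```

It is attached to the input field with emailField.addTextChangedListener(new EmailValidator(emailField));.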

7.2.1.6. Recognition rather than recall

To aid the user, personal information only has to be entered once. It is stored in different locations, for example in SharedPreferences or in the external database. If the user wishes to modify personal information, a page is shown in which all previous information is already filled in, so only the necessary changes have to be made. The registration form uses 'Spinner' fields where necessary, which ensures that only valid information can be given. Wherever there is a large amount of data, it is shown in graphs, making it easier to process.
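A minimal sketch of the SharedPreferences side of this is given below; the preference file name and keys are illustrative assumptions.

```java
// Minimal sketch: store profile data once, then pre-fill forms from it.
import android.content.Context;
import android.content.SharedPreferences;

public class ProfileStore {
    private static final String PREFS = "user_profile";
    private final SharedPreferences prefs;

    public ProfileStore(Context context) {
        prefs = context.getSharedPreferences(PREFS, Context.MODE_PRIVATE);
    }

    public void saveName(String name) {
        prefs.edit().putString("name", name).apply();
    }

    public String loadName() {
        // Pre-filling forms from stored values lets the user recognise
        // their data instead of having to recall and re-enter it.
        return prefs.getString("name", "");
    }
}
```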

7.2.1.7. Flexibility and efficiency

From a flexibility point of view the project is limited in certain areas, but this limitation is deliberate and justified. Users are only able to complete the tests in a fixed order, because if the tests could be chosen in any order, one test could influence the results of the following exercise and the later comparisons could become inaccurate. Using the software does not require a large amount of experience: there are many instructions, and the application guides the users through, as there is only one way to go through the tasks.
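The fixed ordering can be captured in a simple sequence guard such as the sketch below; the class is illustrative only, although the test names match the tests described in this chapter.

```java
// Minimal sketch: tests can only be taken in one fixed order.
import java.util.Arrays;
import java.util.List;

public class TestSequence {
    private final List<String> order = Arrays.asList(
            "Mental questionnaire", "Balance test", "Squat", "Running on the spot");
    private int current = 0;

    /** Only the current test may be started; skipping ahead is not possible. */
    public String currentTest() {
        return order.get(current);
    }

    /** Advance only after the current test has been completed. */
    public boolean complete() {
        if (current < order.size() - 1) {
            current++;
            return true;
        }
        return false; // the sequence is finished
    }
}
```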

7.2.1.8. Aesthetic and minimalist design

When working on the design, overcrowding was avoided and clarity was one of the main aspects. Every page holds only 5-6 objects, and only the necessary functions were added. The visible objects were placed in order of importance, with the most important objects in the middle. The buttons were arranged so that those on the right side of the screen take the user forward, while those on the left take the user a step back.

7.2.1.9. Recognize, diagnose, and recover from errors

When the program stops because of an internal error, for example losing the connection with the Bluetooth device, the program offers the user the option to start the test again. For foreseeable errors, the users are informed with exact details regarding the error.
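A minimal sketch of offering such a restart, assuming an AlertDialog and an illustrative restartTest callback, could look like the following.

```java
// Minimal sketch: offer to restart the test after a lost Bluetooth connection.
import android.app.Activity;
import android.app.AlertDialog;
import android.content.DialogInterface;

public class RecoveryHelper {

    public static void offerRetry(Activity activity, final Runnable restartTest) {
        new AlertDialog.Builder(activity)
                .setTitle("Connection lost")
                .setMessage("The Bluetooth device disconnected. Restart the test?")
                .setPositiveButton("Restart", new DialogInterface.OnClickListener() {
                    @Override
                    public void onClick(DialogInterface dialog, int which) {
                        restartTest.run(); // start the test again
                    }
                })
                .setNegativeButton("Cancel", null)
                .show();
    }
}
```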

7.2.1.10. Help and documentation

The documentation for using the application can be read at any time through one of the options in the top menu. Before every task, instructions are shown to the user. A 'MANUAL' button is available, which gives more detailed information on the tests. The users' experiences or questions can be sent by using one of the options in the 'CONTACTS' menu.

7.2.2. Evaluating the Artefact: Sample End Users

The evaluation of the application by an audience took place on 14/03/2016 in the Darwin Building of the University of Sunderland Sport Science department. The evaluators were sport science students at the University. Three groups took part in the study, each starting at a different time: one at 9am, another at 11am, and the last at 3pm.

 

At the start of the evaluation each participant received a Study Information Sheet and a Consent Form, which they signed to certify that they were willing to volunteer for the evaluation. Before anyone signed, they were informed verbally about the subject, the environment, and the requirements of the project. The handouts of the first group were numbered 003-015, those of the second group 016-025, and those of the third group 026-034. The participants were identified on the questionnaires by their participant numbers.

 

The small classes of 9-13 people were divided into 5 groups, as only 5 tablets and Mi Bands were available for testing. Alongside the tablets, everyone received a task list and a questionnaire. The questionnaire asks about the tasks found on the task list: how difficult they were, whether completing them caused any problems, and how effectively the purpose of each task was achieved. Altogether, 15 questionnaires about the application were filled out throughout the day by people who were directly involved.

 

The study only includes written notes, as not everyone agreed to image or sound recording. The first task was registration in the application. There are no personal questions in the questionnaire, because all such information was acquired through the registration process.

 

The evaluation made by the first group pointed out a number of problems. In the dialog boxes used for the questionnaire, the NEXT and CANCEL buttons are too close to each other; in one instance the questionnaire had to be restarted because a user accidentally exited it. During the tests, only one of the 5 Mi Bands produced accelerometer results, which was much worse than what had been experienced during development, but thanks to the manual controls this was not a large problem for completing the tasks. Apart from these issues the first group did not experience any problems while going through the tasks, and the participants had no further issues or suggestions regarding the application.

 

The second group was informed about the problems the first group had found so that they could avoid them, and there were no further problems with the buttons. This group paid the most attention and performed each task carefully, so the best results were achieved with this group. Three out of the 5 Mi Bands functioned correctly. Based on earlier opinions it had seemed that the 3-minute timeframe for the squat was too long and that reducing it should be considered; however, two participants held it for over 80 seconds, which suggests that reducing the timeframe may not be necessary. In the third physical activity, running on the spot, the number of steps taken was not recorded correctly by the Mi Band. The cause is that the fitness band is worn around the user's wrist, so it counts steps based on arm movement. The software informs the user about this problem, but it cannot be solved in software, and the Mi Band offers no calibration options.

The participant marked 016 also felt that some questions in the mental questionnaire may have been repeated. Since the questionnaire was written based on methods that have been used for a long time and are professionally accepted, this impression may have been subjective. Some of the testers did not feel that the first physical test was effective for measuring concentration; however, testers who did not pay enough attention performed poorly on it. The same participant found running on the spot too boring and squatting too difficult. As these complaints were not made by the other participants, they can be considered personal opinion and do not necessarily require attention. The software did not stop or crash in this group.

 

The third group was the smallest, so the study took less time. With this group the Mi Band's accelerometer worked 5 out of 5 times; the earlier problems may have been caused by the many Bluetooth devices used in a small room, which could have caused interference. In this group, participant 034 made a comment regarding the questionnaire, namely about the first physical test: the participant thought that it was not an effective way to measure the level of concentration. As mentioned previously, concentration was an important aspect, but fine, small movements are also measured by this test, and it serves that purpose well. The same observation was made in the pilot test. If returning users make these suggestions and the achieved scores also point towards the need for improvement, the test will be reconsidered and may require some re-working. The application worked in this group without any errors or crashes.

 

Based on the evaluation, it can be concluded that the application was successful in achieving its goal. The documents produced during the evaluation will not all fit into the appendix, so only a few examples are available in section H; the full set of declarations and questionnaires can only be found on the DVD.

 

The size of the project is also shown in the file list in appendix H. Based on the list, more than 15,000 lines of code are required for the application to function.

7.3. Evaluating the project

Evaluating the project as a whole, the following conclusions can be made. As this was the first large project I had done, I met a number of new tasks. The syllabus of the module went through the different tasks that were used during the development of the software.

 

The expectation was continuous monitoring, for example weekly updates to the Learning Logs and the use of the ePortfolio system. The importance of this did not become apparent until later, so the missing documentation had to be added later on, which required substantial additional work. The schedule also contained some unforeseen variables, which made the project more difficult, and the schedule could not be followed without problems. Modules running alongside this project also caused delays. The time required for the main chapters of the dissertation was unbalanced. Although the objective did not include the full completion of the application, my own mental attitude shifted my main focus towards the application. Although the application was completed, this caused the expected written parts to be incomplete.

 

The continuous Supervisor and Sponsor meetings helped in overcoming the obstacles created by certain situations. Although all the planned functions are done, their effectiveness can only be determined through extensive use, and the application would therefore require continuous fixing and development.