COMP30261 - Assessment Details Laboratory Report
Work handed in up to five working days late will be given a maximum grade of Low Third, whilst work
that arrives more than five working days late will be given a mark of zero. Work will only be accepted
beyond the five working day deadline if satisfactory evidence, for example a Notification of
Extenuating Circumstances (NEC), is provided:
https://www.ntu.ac.uk/studenthub/my-course/studenthandbook/submit-a-notification-of-extenuating-circumstances
The University views plagiarism and collusion as serious academic irregularities and there are a
number of different penalties which may be applied to such offences. The Student Handbook has a
section on Academic Irregularities, which outlines the penalties and states that plagiarism includes:
'The incorporation of material (including text, graph, diagrams, videos etc.) derived from the work
(published or unpublished) of another, by unacknowledged quotation, paraphrased imitation or other
device in any work submitted for progression towards or for the completion of an award, which in any
way suggests that it is the student's own original work. Such work may include printed material in
textbooks, journals and material accessible electronically for example from web pages.'
If copied with the agreement of the other candidate both parties are considered guilty of Academic
Irregularity.
Please remember that submitting portions of work already assessed is self-plagiarism and is also a serious
academic irregularity.
Penalties for Academic irregularities range from capped or zero grades for elements of modules, to
dismissal from the course and termination of studies.
"To ensure that you are not accused of plagiarism, look at the NOW page
Plagiarism support and Turnitin support for guidance."
By presenting such material as your own words you are violating Academic Integrity policy, a matter
that NTU takes very seriously.
The skills you develop during your time with us allow you to interrogate material and evaluate it,
important skills in all careers. ChatGPT does not allow you to develop these.
Laboratory Tasks & Report
Main Tasks
Click on the links below to download the input file.
dftRoadSafetyData_Casualties_2018.csv
OR
dftRoadSafetyData_Casualties_2018.csv
This is the Road Safety casualty data for 2018. Study the file to understand its structure.
The file contains the accident index, vehicle reference, casualty reference, gender, age, severity, and many other
details.
The task is to find the total number of accident index occurrences in each band of
casualty severity. This must be achieved using a Java MapReduce approach under the Hadoop framework, running on
a multi-node cluster (minimum of 3 nodes) configured as a distributed environment for data processing. All
relevant screenshots, at every step, showing your personal details and configuration files must be
provided in the laboratory report.
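To make the counting logic concrete, the following is a minimal plain-Java sketch of the map and reduce steps only. It does not use any Hadoop classes and is not the required Hadoop job. The column name `Casualty_Severity`, the other header names, and the sample rows are assumptions made for illustration; check them against the header of the real file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of map and reduce; no Hadoop classes are involved.
class SeverityCountSketch {

    // ASSUMPTION: the severity column is headed "Casualty_Severity" in the CSV.
    static final String SEVERITY_COLUMN = "Casualty_Severity";

    // Map step: emit one (severityBand, 1) pair for every data row.
    static List<Map.Entry<String, Integer>> map(String headerLine, List<String> rows) {
        int col = List.of(headerLine.split(",", -1)).indexOf(SEVERITY_COLUMN);
        if (col < 0) {
            throw new IllegalArgumentException("Column not found: " + SEVERITY_COLUMN);
        }
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",", -1);
            if (fields.length <= col) {
                continue; // skip rows that are too short to hold the severity field
            }
            pairs.add(Map.entry(fields[col].trim(), 1));
        }
        return pairs;
    }

    // Reduce step: sum the emitted 1s for each severity band.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical header and rows, invented for this sketch (not real data).
        String header = "Accident_Index,Vehicle_Reference,Casualty_Reference,Casualty_Severity";
        List<String> rows = List.of("A001,1,1,3", "A001,1,2,3", "A002,2,1,2", "A003,1,1,1");
        System.out.println(reduce(map(header, rows))); // prints {1=1, 2=1, 3=2}
    }
}
```

In a Hadoop job the same two steps would typically live in a Mapper and a Reducer class, with the framework handling the grouping of pairs by key between them.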
As a result of this activity, you will produce a deliverable in the form of a Laboratory Report containing the
content listed in the table below, which is organised into three sections.
You must provide a Laboratory Report covering the installations, configurations, and associated tasks.
As shown in the table, a report of up to 10 pages of content is recommended (screenshots, title page,
contents page, references, bibliography, and appendices will not be counted as part of the 10 pages).
You can provide as many screenshots as you want, and these should be referenced appropriately in the
appendix section. You should use the Microsoft heading styles. The font to be used is Calibri (Body),
11 point, for the main text. Note that a penalty will be applied if the report exceeds 10
pages.
• Begin with a clear problem statement outlining the objectives of the task and what it aims to
achieve.
• Highlight the key features of your installation and configuration, including any additional
capabilities that extend beyond the task requirements.
• Provide a brief literature review to contextualize your work and align it with current industry
practices and technologies.
• Describe the installed components and the configurations used in the setup, detailing the steps
taken during the implementation process.
• Include relevant screenshots that demonstrate the configuration details and provide evidence of
successful installations.
• Explain the data storage setup on the multi-node clusters, including details on how data is
distributed and managed across nodes.
• Provide a step-by-step guide on how to install and configure Java, Hadoop, and Apache Spark in a
multi-node distributed system.
• Describe how the configured Hadoop clusters are employed in a distributed environment to
facilitate data processing.
• Use the MapReduce framework for solving data processing problems and analysing files stored in
HDFS formats. Show and explain your implementation with examples.
• Present the results from both the terminal and the web user interfaces (Hadoop Data Nodes and
Resource Manager) to illustrate the successful execution of the task.
• Provide an evaluation of the solution, discussing the effectiveness and efficiency of the
implementation.
• Offer suggestions for improvement and potential enhancements to the solution based on your
findings and evaluation.
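When explaining how data is distributed across nodes, a small calculation can help. The sketch below estimates how many HDFS blocks and block replicas a file occupies. The 128 MiB block size and replication factor of 3 are the usual Hadoop defaults and are assumed here; your own cluster settings may differ, and the 300 MiB file size is purely hypothetical.

```java
// Rough HDFS block arithmetic; all concrete values below are illustrative assumptions.
class HdfsBlockSketch {

    // Number of blocks a file is split into: ceiling of fileBytes / blockBytes.
    static long blockCount(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("Sizes must be non-negative and block size positive");
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Total number of block replicas stored across the cluster.
    static long replicaCount(long fileBytes, long blockBytes, int replication) {
        return blockCount(fileBytes, blockBytes) * replication;
    }

    public static void main(String[] args) {
        long blockBytes = 128L * 1024 * 1024; // ASSUMED block size (common default)
        int replication = 3;                  // ASSUMED replication factor (common default)
        long fileBytes = 300L * 1024 * 1024;  // hypothetical 300 MiB input file

        System.out.println(blockCount(fileBytes, blockBytes) + " blocks, "
                + replicaCount(fileBytes, blockBytes, replication) + " replicas");
        // prints "3 blocks, 9 replicas"
    }
}
```

The last block of a file is usually only partly full, so the replica count is not a direct measure of disk space used.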
Submission Requirements
1. The Laboratory report as detailed above in electronic format only (use .docx ONLY).
2. By the submission deadline, you are expected to submit your report to the NOW Dropbox under the
‘Laboratory Report’ folder.
3. You are asked to write the report using the provided report template. It is recommended to cite
and list references using the Harvard referencing style (visit
https://www.ntu.ac.uk/m/library/referencing-made-easy).
4. Include a statement of ownership which states that this is your own work and provides references to
sources if elements of your system come from other works. For more information, or if you are in
doubt, you must discuss this with the lab tutor. Also see the Guidance for Electronic Submission of
Lab Report on NOW in the Courses Info Module.
5. NOTE that your report will be assessed according to the Assessment Criteria section.
II. Assessment Criteria
Class / Grade scale (the rubric columns run from the highest to the lowest band):
*Exceptional Distinction: 16
First: High 15 | Mid 14 | Low 13
Upper Second: High 12 | Mid 11 | Low 10
Lower Second: High 9 | Mid 8 | Low 7
Third: High 6 | Mid 5 | Low 4
Fail: Marginal 3 | Mid 2 | Low 1
*Zero: 0
The table also includes Comments and Grade columns.
Section 1. Introduction & Design (Weighting is 0.3)
First: An excellent description of the introduction and design of the approach and its applications. An excellent description of the Hadoop/Big data/Apache Spark framework, problems to be solved, and underlying literature. A deep understanding of the problem area and related industry. An excellent use of sources that evidence independent study and, in some cases, content that is not taught. Excellent justification and intended insight are given.
Upper Second: A good description of the introduction and design approach and its applications with appropriate reference to the literature but may miss some details. A good description of the Hadoop/Big data/Apache Spark framework, problems to be solved, and underlying literature. A good understanding of the problem area and related industry. Good justification and intended insight are given.
Lower Second: A reasonable description of the introduction and design approach and its applications but may miss many details and lacks appropriate reference to the literature. A reasonable description of the Hadoop/Big data/Apache Spark framework, problems to be solved, and underlying literature. A reasonable understanding of the problem area and related industry but without depth. Reasonable justification and intended insight given.
Third: Some description of the introduction and design approach and its applications without details. Some description of the Hadoop/Big data/Apache Spark framework, problems to be solved, and underlying literature. Little understanding of the problem area. Insufficient justification, which may not be based on the literature, is provided. Some justification and intended insight mentioned.
Fail: No meaningful description of the introduction and design approach and its applications. No information about the Hadoop/Big data/Apache Spark framework, problems to be solved, and underlying literature. No literature and/or consideration of intended outcomes is provided.
*Zero: This section is missing.
Section 2. Tasks & Implementation (Weighting is 0.4)
First: An excellent description of tasks and implementation strategies with workflow and excellent flow chart. Excellent relevant and important screenshots provided. An excellent implementation showing data storage on the multi-node clusters. Excellent installation and configuration of Java, Hadoop and Apache Spark in a multi-node distributed system. Excellent use of the MapReduce framework for data processing problem solving and analysis for files stored in HDFS formats. Excellent results will be shown and described from both the terminal and the Web user interfaces (Hadoop data nodes and Resource manager). Go above and beyond the requirements.
Upper Second: A good description of tasks and implementation strategies with workflow and excellent flow chart. Good relevant and important screenshots provided. A good implementation showing data storage on the multi-node clusters. A good installation and configuration of Java, Hadoop and Apache Spark in a multi-node distributed system. A good use of the MapReduce framework for data processing problem solving and analysis for files stored in HDFS formats. A good result will be shown and described from both the terminal and the Web user interfaces (Hadoop data nodes and Resource manager).
Lower Second: A reasonable description of tasks and implementation strategies with workflow and excellent flow chart. Reasonable relevant and important screenshots provided. A reasonable implementation showing data storage on the multi-node clusters. A reasonable installation and configuration of Java, Hadoop and Apache Spark in a multi-node distributed system. A reasonable use of the MapReduce framework for data processing problem solving and analysis for files stored in HDFS formats. A reasonable result will be shown and described from both the terminal and the Web user interfaces (Hadoop data nodes and Resource manager).
Third: Some description of tasks and implementation strategies with workflow and excellent flow chart. Some relevant and important screenshots provided. Some implementation showing data storage on the multi-node clusters. Some installation and configuration of Java, Hadoop and Apache Spark in a multi-node distributed system. Some use of the MapReduce framework for data processing problem solving and analysis for files stored in HDFS formats. Some result will be shown and described from both the terminal and the Web user interfaces (Hadoop data nodes and Resource manager).
Fail: No meaningful tasks and implementation strategies with workflow and excellent flow chart. No meaningful relevant and important screenshots provided. No meaningful implementation showing data storage on the multi-node clusters. No meaningful installation and configuration of Java, Hadoop and Apache Spark in a multi-node distributed system. No meaningful use of the MapReduce framework for data processing problem solving and analysis for files stored in HDFS formats. No meaningful result will be shown and described from both the terminal and the Web user interfaces (Hadoop data nodes and Resource manager).
*Zero: This section is missing.
Section 3. Critique, Discussion and Summary (Weighting is 0.3)
*Exceptional: Critique, discussion and summary are perfect.
First: An excellent, clear and concise summary of the work and findings is given. An excellent account of the insight gained is provided together with clear indications as to how to advance the work further. An excellent discussion of what worked well and what did not work well. Provides an excellent critique of tasks and discussion of work. Provides an excellent way of how further work could be improved.
Upper Second: A good, clear and concise summary of the work and findings is given. A good account of the insight gained is provided together with clear indications as to how to advance the work further. A good discussion of what worked well and what did not work well. Provides a good critique of tasks and discussion of work. Provides a good way of how further work could be improved.
Lower Second: A reasonable, clear and concise summary of the work and findings is given. A reasonable account of the insight gained is provided together with clear indications as to how to advance the work further. A reasonable discussion of what worked well and what did not work well. Provides a reasonable critique of tasks and discussion of work. Provides a reasonable way of how further work could be improved.
Third: Some clear and concise summary of the work and findings is given. Some account of the insight gained is provided together with clear indications as to how to advance the work further. Some discussion of what worked well and what did not work well. Provides some critique of tasks and discussion of work. Provides some way of how further work could be improved.
Fail: No meaningful clear and concise summary of the work and findings is given. No meaningful account of the insight gained is provided together with clear indications as to how to advance the work further. No meaningful discussion of what worked well and what did not work well. Provides no meaningful critique of tasks and discussion of work. Provides no meaningful way of how further work could be improved.
*Zero: This section is missing.
III. Feedback Opportunities
Formative (Whilst you’re working on the coursework)
You can receive formative feedback from me at any time. Please contact me by email
for my feedback.
Results and feedback will be available from Dropbox on the relevant date.
V. Moderation
The Moderation Process
All assessments are subject to a two-stage moderation process. Firstly, any details
related to the assessment (e.g., clarity of information and the assessment criteria) are
considered by an independent person (usually a member of the module team).
Secondly, the grades awarded are considered by the module team to check for
consistency and fairness across the cohort for the piece of work submitted.
Professional Practice
Understand more about the demands of a professional work context and
possible scenarios.
Learn how constraints and requirements affect working practice (quality,
time, cost, commercial considerations, materials, and documentation
standards etc.)
Links to Industry
Increase your awareness and understanding of how
industries/organisations operate, e.g., opportunities, challenges, and latest
developments etc.