iCan Help You: The Benefits of Artificial Intelligence to Military Forces Outside of Warfighting Operations
Introduction
To take the King’s hard bargain is a ‘traditional description for the rendering of military service to the Crown, made inaccurate in modern times only by the gender of the current Sovereign’.1 This bargain’s hardness is multifaceted. It denotes that military service involves a unilateral agreement—that the member gives everything and expects nothing. It further represents that one takes an oath to serve within the profession of arms, whose raison d’être of warfighting is best highlighted through the role of the Royal Australian Infantry Corps:
… to seek out and close with the enemy, to kill or capture him, to seize and hold ground and to repel attack, by day or night, regardless of season, weather or terrain.2
In achieving this capability—warfighting—another aspect of the King’s hard bargain becomes apparent, one that is not as readily taught at the Royal Military College of Australia as infantry minor tactics. The day-to-day administration of personnel constitutes a significant burden on any commander, detracting from the ability to conduct training to prepare for combat, combat support or combat service support. This paper suggests a possible method for the Australian Defence Force (ADF) to help reduce the cognitive clutter surrounding its administration, policy and military discipline through the use of machine learning algorithms.
Automated decision-making systems are becoming more prevalent in government processes around the world, in areas as diverse as the administration of social security, taxation, criminal sentencing and migration.3 These systems are most likely to be deployed in branches of government that must cope with a high caseload and repetitive assessments against prescriptive criteria. However, as will be shown below, automated systems can vary in nature, which is likely to have implications for the manner in which they are authorised or delegated, as well as for the risks that might be posed by indiscriminate use of those systems. Accordingly, this paper first canvasses the lexicon and meaning of terms such as artificial intelligence and machine learning, before discussing technical aspects of the processes and tools these capacities can produce. Next, it applies these tools to three problems: assisting individuals sentencing ADF members within the military discipline system; assisting ADF decision-makers to make consistent decisions when imposing administrative sanctions; and assisting central bodies such as the Career Management Agency with posting plots and career plans. This paper will not deal with some of the more nuanced legal issues surrounding automated decision-making.
Artificial Intelligence and Machine Learning: What Are They?
Much has been written on artificial intelligence (AI) and machine learning (ML). Deyi Li and Yi Du helpfully describe artificial intelligence as follows:
Intelligence can be defined as wisdom and ability; AI is a variety of human intelligent behaviours, such as perception, memory, emotion, judgment, reasoning, proof, recognition, understanding, communication, design, thinking, learning, forgetting, creating, and so on, which can be realized artificially by machine, system, or network.4
So must all of these criteria be met before the ADF can take advantage of developments in AI? Expectations of AI currently outweigh its capabilities,5 but this is not to say that the ADF cannot use some of those developments to promote better, more organisationally useful and methodologically transparent decisions. AI research has produced a number of methods that are already in wide use across many industries, and ML is one such subset.
ML, as a subset of AI, uses statistical methods to enable computers to improve with experience using non-linear processing.6 It has proven useful for particular tasks such as sorting data, finding patterns and trends, and completing a high volume of repetitive tasks quickly while minimising errors. These automated systems can assist administrative decision-making in a number of ways: they can make the decision, recommend a decision to the decision-maker, or guide a user through relevant facts, legislation and policy.7 Although the ADF's organic tri-Service Military Legal Service (MLS) provides uniformed legal officers in relevant command formations, the majority of the ADF's decisions are made by legally untrained commanders, who are required to navigate sometimes complex legal and policy frameworks. As will be seen below, ML can help ameliorate these issues.
One benefit is that such algorithms improve over time as they are exposed to larger datasets, and can be refined as a matter of course (issues with data quality are discussed later). The challenge is the initial dataset: the dataset that teaches the algorithm how to reach a decision while reducing the likelihood of producing a false positive or false negative. This would require some initial investment from the Department of Defence, although much of the data is now digitised. A core method of development is supervised learning.8 Supervised learning uses historical data in which the decision has already been made, thereby showing what the desired outcome is. The machine thus learns to correctly identify the outcome types. This is very useful for organisations, like the ADF, that produce and collect large amounts of data and should aim for consistent outcomes, such as in sentencing and administrative sanctions.
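By way of illustration, the following is a minimal sketch of supervised learning on historical decision records, assuming a hypothetical tabular dataset; the column names, outcome labels and values are invented for illustration and do not reflect any actual ADF data or schema.

```python
# A minimal sketch of supervised learning on historical decision records.
# All columns, labels and values are hypothetical illustrations only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical historical records: each row is a past, human-made decision.
records = pd.DataFrame({
    "rank_level": [1, 2, 1, 3, 2, 1, 3, 2],
    "years_of_service": [2, 6, 3, 12, 8, 1, 15, 7],
    "prior_incidents": [0, 1, 2, 0, 1, 3, 0, 2],
    "outcome": ["no_action", "warning", "warning", "no_action",
                "warning", "sanction", "no_action", "sanction"],
})

X = records[["rank_level", "years_of_service", "prior_incidents"]]
y = records["outcome"]

# Hold out part of the labelled data to test how well the model reproduces
# historical outcomes it was not trained on.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

print("Agreement with held-out historical decisions:",
      accuracy_score(y_test, model.predict(X_test)))
```

In practice the held-out test would be far larger and verified by a human supervisor, which is the point made in the following paragraphs.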
There are, however, definite challenges with ML. First, a specific ML technique trained on a particular labelled dataset may not be suitable for another dataset or data domain.9 This means that, at this stage, a separate algorithm would need to be developed for each ADF problem set, especially where a narrow output is required.
Another major challenge is the way in which ML must be trained. Current ML algorithms require large amounts of verified data to perform tasks that a child could learn from only a handful of examples. This creates a problem for designing algorithms: a large dataset is needed, along with a human to verify that dataset, in order to properly train the algorithm and verify its output. Supervised learning uses four classifications:
True positive — Correct identification of a correct input
True negative — Correct identification of an incorrect input
False positive — Incorrect identification of a correct input
False negative — Incorrect identification of an incorrect input.10
Under the supervised learning method, false negatives and false positives are the key areas that must be verified by the supervisor. This is usually done by running the algorithm against validated datasets and identifying how often it produces a false response. Depending on the classification method used (such as decision tree or Bayesian11), the algorithm will require different levels of input. For example, a decision tree is a simple and fast method that supports incremental learning; however, it needs very accurate data and time-consuming training to produce an effective output.
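As an illustration, the following minimal sketch evaluates a simple decision tree classifier against a hypothetical validated dataset and counts the four classification outcomes described above; all values are invented for illustration.

```python
# A minimal sketch of verifying a supervised model against a validated
# dataset by counting true/false positives and negatives.
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

# Hypothetical training data: 1 = conduct meets the criterion, 0 = it does not.
X_train = [[0, 1], [1, 0], [1, 1], [0, 0], [1, 2], [0, 2]]
y_train = [0, 1, 1, 0, 1, 0]

# Hypothetical validated dataset, with outcomes confirmed by a human supervisor.
X_validated = [[1, 1], [0, 0], [0, 2], [1, 0]]
y_validated = [1, 0, 1, 1]

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
predictions = clf.predict(X_validated)

# confusion_matrix returns [[TN, FP], [FN, TP]] for binary labels.
tn, fp, fn, tp = confusion_matrix(y_validated, predictions).ravel()
print(f"True positives: {tp}, True negatives: {tn}")
print(f"False positives: {fp}, False negatives: {fn}")
print(f"False response rate: {(fp + fn) / len(y_validated):.2f}")
```

The 'false response rate' printed at the end is the figure the supervisor would track as the algorithm is refined.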
Another method is unsupervised learning, in which the algorithm uses unlabelled data and looks for patterns with limited human supervision. An example of this is cluster analysis, which groups common elements in the data and finds patterns based on the presence, or absence, of those commonalities. This method can discover features of a dataset but is less accurate than supervised learning.12
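The following minimal sketch shows cluster analysis on a small set of hypothetical, unlabelled records using k-means; the feature values are invented for illustration.

```python
# A minimal sketch of unsupervised cluster analysis with k-means.
# The feature values are illustrative only.
import numpy as np
from sklearn.cluster import KMeans

# Unlabelled records, e.g. [incidents per year, years of service].
data = np.array([[0, 2], [1, 3], [0, 1], [5, 2], [6, 1], [5, 3], [1, 12], [0, 15]])

# Group the records into three clusters with no labels and no
# human-defined outcome.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(data)
print("Cluster assigned to each record:", kmeans.labels_)
print("Cluster centres:", kmeans.cluster_centers_)
```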
Gary Klein and Daniel Kahneman developed a theory of environmental validity: whether an outcome can be intuitively predicted depends on the regularity of an environment's variables.13 This idea breaks down intuition and how it can and cannot be applied to different environments. Two extremes of this scale would be predicting how a house fire will behave, and predicting the price movement of a stock on the share market. A firefighter with many years of experience can use intuition to determine whether it is safe to enter a building or even when to stop fighting a fire. The variables that determine how a fire behaves are comparatively easy to validate, and a person who experiences a large number of fires can learn to see which variables must be present to determine how it will act.
This is why fire modelling is used to determine flashpoints and how a fire will develop, and can inform the action taken.14 This can be classified as a high-validity environment. The stock market, on the other hand, would be considered low validity, as there are so many variables, from the economic to human behaviour, that it is currently impossible to develop reliable intuition about the market. Applying this idea to ML raises a question: can we ensure that all the variables a human would consider can be entered into an algorithm to give us the best decision? The ADF collects a large amount of data that could be used to train and test an algorithm, but it would need to ensure safe collection practices for data that is to be used in training and validation datasets.
Equally, bias in ML is a significant issue that can have long-term effects on the organisation.15 Bias can enter at different points in the development process, from the design of the algorithm to the data used to train it. Consideration must be given to methodologies that not only identify bias but also mitigate its effects. This is particularly relevant to the potential use of ML in assisting summary authorities or superior tribunals with sentencing. An algorithm is developed with a particular outcome in mind, but the bias of those who design and train it can affect the output it produces. This is a commonly voiced concern in the field of lethal autonomous weapon systems.16 It has also been a source of anxiety in criminal law and sentencing. The data itself can carry bias as well, as it may not have been collected specifically for training that algorithm.17 Equally, how that data came into being, and the structure within which it was collected, may have bias built in. For example, a dataset may have been developed to understand the likelihood of soldiers of a certain career length and rank committing a particular offence, and deliberately exclude officers. Such a dataset might show a larger number of assaults committed by soldiers who have served three years without being promoted. If an algorithm is trained on this data it may develop a bias against soldiers who meet those criteria, regardless of many other variables.
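One simple, illustrative check for this kind of bias is to compare a trained model's false positive rate across groups such as rank. In the minimal sketch below, the records, rank categories and predictions are hypothetical.

```python
# A minimal sketch of checking model outputs for group-level bias by
# comparing the false positive rate across a hypothetical 'rank' attribute.
import pandas as pd

results = pd.DataFrame({
    "rank": ["soldier", "soldier", "soldier", "nco", "nco", "officer", "officer", "soldier"],
    "actual":    [1, 0, 0, 0, 1, 0, 0, 0],   # conduct actually occurred
    "predicted": [1, 1, 0, 0, 1, 0, 0, 1],   # model output
})

for rank, group in results.groupby("rank"):
    # False positives only make sense among records where the conduct
    # did not actually occur.
    negatives = group[group["actual"] == 0]
    if len(negatives) == 0:
        continue
    fp_rate = (negatives["predicted"] == 1).mean()
    print(f"{rank}: false positive rate = {fp_rate:.2f} over {len(negatives)} records")
```

A marked disparity between groups would prompt a review of the training data before any output is relied upon.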
There is opportunity in bias if it is deliberately introduced in a controlled way. Inductive bias can be used to help develop an algorithm that can deal with new situations. Humans can come to a conclusion without knowing all the information about a situation; for a machine to do this, an inductive leap would need to be possible,18 whereby it deliberately invokes biases to choose one generalisation of the situation over another.
A final issue with ML is the transparency and consistency of the data and algorithm being used. The vastness of the dataset, the complexity of the machine learning process, and the form in which the outcome is provided (with or without reasons) can make it hard to challenge legally, technically or morally.19 Unless you are a software engineer, how are you going to understand how the algorithm processed the inputs and reached the decision it did? It is much harder to unpick the logic of hundreds of lines of computer code without specialist knowledge. The inherent trust placed in seemingly objective, clinical AI systems creates a feeling of data sacrosanctity and undermines any perceived right of appeal. This creates a fear of the unknown and undermines trust in the validity of AI decision-making.
This issue is compounded within the ADF, where policy makes clear that responsibility must lie with a decision-maker. This is how the redress of grievance system operates. The use of AI raises the question of at what point the algorithm, rather than the human, is effectively making the decision.
These are all questions on which, if ML assistance were introduced into the ADF, departmental or governmental positions would be required. This is not, however, unique. Australian legislation is littered with delegated authority authorising the assistance of algorithms.20 While valid, the issues raised above are not fatal to modernising the ADF.
Military Discipline
Sentencing, or punishment, during military disciplinary proceedings is one area potentially amenable to ML assistance, despite sentencing occurring through human intuitive synthesis. At the sentencing stage of a hearing, most of the relevant facts have already been established, or are readily ascertained by the sentencing figure. Helpfully, with the implementation in 2020 of a completely digital transcript (Form C2) capturing the charge, personal data, conviction, sentence, reasons and legality, the hard work of collecting data has become easier.21
Sentencing under the Defence Force Discipline Act 1982 (Cth) (DFDA) consists of relatively few variables: the relevant authority must give consideration to civilian sentencing principles22 and to the need to maintain service discipline.23 Civilian sentencing principles include the person's rank, age and maturity; the person's physical and mental condition; the person's personal history; the absence or existence in the person's case of previous convictions for service offences, civil court offences and overseas offences; the person's relationship with the victim (if the service offence involves a victim); the person's behaviour before, during and after the commission of the service offence; and any consequential effects of the person's conviction or proposed punishment. Noting that ML is underpinned by statistics, and given the small dataset of Service matters, there is also potential to draw on civilian datasets.
These are all data points with which a sentencing algorithm, utilising ML, could assist a sentencing authority. The goal of such an algorithm would be to promote and ensure consistency, and the ADF would not be the first to use one. Algorithms have been used in this way in the United States since 2013,24 with judges informally referring to them for guidance.25 Utilising an ADF-wide risk assessment algorithm as an aid to summary authorities and superior tribunals would help promote consistency, transparency and accountability, while still allowing the decision-maker to consider variables that were not entered into the algorithm. The data could be entered through the current practice of completing the relevant pre-sentencing report, which outlines financial mitigating circumstances. Moreover, the algorithm could readily take into account service records, age, rank, time in rank, qualifications, previous convictions, spent convictions and dependents.
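Purely as an illustration of how those data points might feed a consistency aid, the following minimal sketch encodes a handful of hypothetical sentencing factors and suggests a punishment band from invented historical outcomes. It is not a proposal for an actual model; the features, bands and records are assumptions.

```python
# A minimal sketch of encoding sentencing considerations as inputs to a
# guidance model. Feature names, punishment bands and records are
# hypothetical and do not reflect any actual ADF dataset.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

historical_sentences = pd.DataFrame({
    "rank_level": [1, 1, 2, 3, 1, 2],
    "age": [19, 22, 28, 35, 20, 30],
    "time_in_rank_years": [1, 3, 4, 6, 2, 5],
    "previous_convictions": [0, 1, 0, 0, 2, 1],
    "has_dependents": [0, 0, 1, 1, 0, 1],
    # Hypothetical punishment bands previously imposed by summary authorities.
    "punishment_band": ["fine", "restriction", "fine", "reprimand", "restriction", "fine"],
})

features = ["rank_level", "age", "time_in_rank_years", "previous_convictions", "has_dependents"]
guide = DecisionTreeClassifier(max_depth=3, random_state=0)
guide.fit(historical_sentences[features], historical_sentences["punishment_band"])

# The output is a consistency suggestion only; the summary authority retains
# the discretion to depart from it and to weigh factors not captured here.
new_matter = pd.DataFrame([[1, 21, 2, 1, 0]], columns=features)
print("Suggested punishment band:", guide.predict(new_matter)[0])
```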
The use of sentencing algorithms within the criminal justice system is neither novel nor unique. As noted above, bias in data is a particularly relevant concern when it comes to sentencing algorithms. A review of the publicly available data on superior tribunals at the time of publication reveals that junior soldiers are charged, and found guilty, twice as often as non-commissioned officers (NCOs).26 One factor could be that these soldiers are younger on average and therefore prone to more risk taking; another is simply that there are more enlisted members than commissioned officers. Yet this data, if fed improperly to an algorithm, would suggest that soldiers are statistically more likely to offend than NCOs or officers, and could lead to them being sentenced more severely. This reflects bias reinforcement in the decision-making process: if a large number of soldiers have been sentenced and the algorithm learns to target soldiers from that dataset, it becomes a self-fulfilling prophecy. This demonstrates the 'what you put in is what you get out' learning issue. Whether junior soldiers actually offend more than other classes of ADF members, or are simply charged more, would not necessarily be reflected in the sentencing algorithm.
The High Court of Australia has recently held that ‘while there may be an area of concurrent jurisdiction between civil courts and service tribunals, there is no warrant in the constitutional text for treating one as subordinate or secondary to the other’.27 This is rightly so. However, there are some distinguishing factors between civilian sentencing and military sentencing. As previously mentioned, relevant military sentencing principles include the need to maintain and enforce service discipline. This sentencing consideration provides flexibility to commanders, enabling them to exercise discretion—such as where compassionate circumstances exist, or circumstances that warrant considering the conduct as an aggravating factor in sentencing. This too is rightly so. However, it is not necessary that this discretion should remain completely unfettered. If an algorithmic approach is taken, further sentencing principles could be introduced.
Administrative Sanctions
Another branch of the military justice system is imposition of adverse administrative action.28 Adverse administrative actions are designed to admonish and correct unsatisfactory or unacceptable performance and are initiated and then managed by more senior officers. In the military justice system, disciplinary offences are specified in the DFDA and cover a range of activities or offences. There are, however, many contraventions of rules and regulations that are not punishable under the DFDA but are nonetheless subject to administrative sanctions. Defence Manual ADFP 06.1.3 notes:
Adverse administrative action is usually initiated and/or imposed when the conduct or performance of a member is below the standard expected of a particular member and is not in the interests of the ADF. It is official action that reflects formal disapproval on a temporary or permanent basis.
In determining what, if any, adverse administrative action should be taken, the merits, the circumstances and the sufficiency of evidence in each case must be reviewed. A decision whether to impose adverse administrative action depends on the seriousness of each case and the interests of the ADF. It also requires a thorough understanding of the relevant policy.
The ADF policy frameworks surrounding triggers for adverse administrative action are wide and convoluted. Despite recent efforts to streamline the many manuals, the complexity of overlapping policy directions and constraints can lead to confusion. Take, for example, the Military Personnel Policy Manual (MILPERSMAN), the unclassified public copy of which is 745 pages. MILPERSMAN Part 4, Chapter 1 relates to the use and abuse of alcohol. This policy has differing thresholds across the three Services as to what administrative sanctions may, or must, be initiated on the basis of the number of alcohol incidents or the blood alcohol level. With respect to the Australian Army, commanding officers (COs) are given non-discretionary directions with respect to alcohol-related incidents.29 For a first incident, a notice to show cause for a formal warning may be issued; for a second incident, COs should issue a notice to show cause for a formal warning or a reduction in rank; for a third incident, COs are to issue a notice to show cause for termination.30 These notices do not automatically result in a termination decision being imposed.
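The non-discretionary direction described above lends itself to simple, transparent rules. The following minimal sketch expresses that logic in code; the output wording paraphrases the policy as described in this paper and is not the policy text itself.

```python
# A minimal sketch of the Army alcohol-incident direction described above
# (MILPERSMAN Part 4, Chapter 1), expressed as a simple rule. The output
# wording is a paraphrase for illustration only.
def alcohol_incident_guidance(incident_number: int) -> str:
    """Return the show-cause action indicated for the nth alcohol incident."""
    if incident_number <= 0:
        raise ValueError("incident_number must be 1 or greater")
    if incident_number == 1:
        return "Notice to show cause for a formal warning may be issued (discretionary)."
    if incident_number == 2:
        return ("Notice to show cause for a formal warning or a reduction in rank "
                "should be issued.")
    return "Notice to show cause for termination is to be issued."

for n in (1, 2, 3):
    print(f"Incident {n}: {alcohol_incident_guidance(n)}")
```

Even a rule this simple, surfaced at the right moment, spares the Adjutant or CO from re-reading the manual to confirm which action is mandatory and which is discretionary.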
From an Army perspective, the oversight of administrative sanctions at the unit level is often administered by the Adjutant—a senior Captain, whose exposure to military justice is less than that of the sub-unit and unit commanders imposing the sanctions. Navigating the policy, including knowing what conduct triggers certain non-discretionary administrative sanctions, may be difficult for the individual, notwithstanding the support of legal officers and senior commissioned and non-commissioned officers.
Here ML could further support ADF decision-makers, in a completely different way to sentencing considerations. 'Automated systems' can assist administrative decision-making in a number of ways: they can make the decision, recommend a decision to the decision-maker, or guide a user through relevant facts, legislation and policy.31 The last is most applicable here. The use of ML for automated guidance through policy is neither novel nor unique; the Australian Department of Veterans' Affairs established a compensation claims processing system to automate certain aspects of its assessment and determination of compensation claims from veterans and their families.32 The system guides decision-makers in applying over 2,000 pages of legislation and over 9,700 different rules. The efficiency gains have been substantial: Veterans' Affairs now determines 30 per cent more claims annually using 30 per cent fewer human resources in substantially less time, resulting in departmental savings of approximately $6 million each year.33 Accordingly, automated guidance has allowed for an increase in the overall workload each decision-maker can carry.
What might become harder to digitise is the notion of what ‘in the interests of the Defence Force’ means—often used as the basis for the termination of an ADF member’s service.34 Reasons for something being or not being in the interests of the Defence Force include reasons relating to one or more of the following:
- a member’s performance
- a member’s behaviour (including any convictions for criminal or service offences)
- a member’s suitability to serve:
  - in the Defence Force
  - in a particular role or rank
- a member’s failure to meet one or more conditions of their enlistment, appointment or promotion
- workforce planning in the Defence Force
- the effectiveness and efficiency of the Defence Force
- the morale, welfare and discipline of the Defence Force
- the reputation and community standing of the Defence Force.35
These are largely discretionary concepts, reflective of an earlier notion that service within the ADF is at the pleasure of the Crown. Accordingly, digitised triggers may need to be created. These triggers could be divided into conduct that definitively meets the concept of 'service no longer in the interests of the Defence Force', such as convictions for sexual offences, substantiated complaints of domestic violence, or high-range driving under the influence; and conduct for which termination is strongly recommended, such as theft, fraud or other trust-related issues or, for Royal Australian Air Force members, prohibited substance possession.
Here ML could provide that, when certain triggers defined by policy and law are met, decision-makers are notified of the appropriate administrative sanction to be taken. The conduct could be entered into a decision tree, with each decision point guiding the decision-maker towards the correct policy. It could highlight the relevant policy, the discretion the commander has, where procedural fairness must be given,36 and the time frames in policy or law that must be adhered to.37 Further, it could highlight the relevant considerations that must be taken into account and, if refined with input from relevant case law, flag when an irrelevant consideration has crept in.38 Such assistance could help minimise jurisdictional error, save costs on litigation, increase the timeliness of decision-making (benefiting both the decision-maker and those awaiting outcomes) and increase trust in the apolitical and impartial nature of the decision being made.
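A minimal sketch of such a guided aid is set out below for a single, illustrative branch (a termination notice for dishonest conduct). The field names and messages are hypothetical, while the regulation references follow those cited in this paper.

```python
# A minimal sketch of a guided decision aid for an administrative sanction.
# Each step points the decision-maker to the relevant policy, flags where
# procedural fairness applies and states the statutory response period.
# Field names and branch logic are hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class SanctionGuidance:
    policy_reference: str
    commander_discretion: bool
    procedural_fairness_required: bool
    response_period_days: int

def guide_termination_notice(conduct: str) -> SanctionGuidance:
    """Guide a decision-maker considering a termination notice for the conduct."""
    # A real system would branch across the full policy framework; this
    # sketch handles a single, illustrative branch only.
    if conduct == "dishonesty":
        return SanctionGuidance(
            policy_reference="Defence Regulation 2016 (Cth) s 24(1)(c)",
            commander_discretion=True,
            procedural_fairness_required=True,   # s 24(3)
            response_period_days=14,             # s 24(2)
        )
    raise NotImplementedError("Conduct type not covered by this sketch")

print(guide_termination_notice("dishonesty"))
```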
This could assist commanders in navigating the complex and esoteric maze of uncertainties of ADF policies. Using the ML algorithm in an assist function allows the decision-maker to deal with situations that an algorithm cannot handle. This ensures that a member is not unfairly treated as a result of their unique situation and that data can be fed back into the algorithm for future similar situations.
An example of the benefit of automated systems, and of introducing trigger points for certain administrative sanctions, is shown through a hypothetical based on the facts of the recent Defence Force Discipline Appeals Tribunal decision in McCleave v Chief of Navy.39 The matter was an appeal from a decision of a Defence Force Magistrate, and concerned the alleged dishonest submission of Reserve training days by a Reserve Navy legal officer, Lieutenant Justin McCleave. LEUT McCleave claimed to have trained for three days for his mandatory awareness training, and submitted fees to that effect.40 It emerged through the administrative process, conducted by a paralegal who checked whether the online courses had been accessed and completed, that LEUT McCleave had failed to log into the Defence Protected Network at all on the days claimed, let alone done the work.41 After the alleged dishonest behaviour was discovered, consideration was given as to whether administrative or disciplinary action should be taken.42
A decision was made by the chain of command that no disciplinary action under the DFDA would be taken. Rather, administrative action would be taken, with a formal warning being imposed on the member. As noted above, the administrative sanctions available to the decision-maker included initiating a termination notice. The basis for this termination notice would be that LEUT McCleave's service was no longer in the interests of the Defence Force.
Now, disregarding the command decision to take administrative sanctions instead of disciplinary action, and the issues that may be associated with that, LEUT McCleave's service could, to a reasonable mind, no longer be in the interests of the Defence Force.43 This is informed not only by the accepted fact that dishonest behaviour is corrosive to the trust necessary for disciplined forces,44 and that the member was a commissioned officer who is meant to lead by example, but also by the fact that the member was a Reserve legal officer, whose dishonest and potentially fraudulent behaviour could suggest he be struck off the relevant roll.
There are benefits to taking a consistent approach to administrative sanctions and disciplinary proceedings, rather than leaving the choice fully within the discretion of commanders. It has been said that 'Duty and Discipline do not march well with Discontent'.45 The retention of ADF members is not likely to be aided when they are uncertain as to whether their actions are, or are not, in the interests of the Defence Force, especially when what would appear sufficiently poor conduct to merit dismissal is met with a low-level administrative sanction. Accordingly, amending s 6(2) of the Defence Regulation 2016 (Cth) so that convictions for lack of honesty (such as fraud) or lack of control (such as assault) are automatically viewed as not in the interests of the Defence Force may impose a more consistent decision-making approach across the three Services, whose policy guidelines may differ substantially on matters such as these. Duty and discipline may continue to march in lockstep.
Career Management
An area where automated guidance in decision-making policy could enhance decisions and assist decision-makers is career management. Career management is a highly complex system that seeks to meet the needs of the organisation while managing the desires of individuals, so that the ADF maintains a highly effective workforce.
By conducting a thought experiment of applying ML to the Personnel Appraisal Report process (the annual work performance assessment of a subordinate report), we can look at how ML can improve the efficiency of the appraisal system. This thought experiment also highlights ways to mitigate some of the friction points in the process.
A key element of the annual appraisal system currently in use is that it is subjective. Appraisal is usually based on notes of negative and positive counselling, work outputs or activities completed outside the member's mandated job description. The key weakness is reliance on the diligence of the individual supervisor to maintain a record of these notes throughout the year in order to make an informed assessment of the member's performance at the end of the reporting period. Although there are tools available to contemporaneously record performance, such as the platoon notebook or troop commander's notebook for Army, or COMPASS for Navy, the system relies on the supervisor to use these tools effectively and to apply their time equally to all subordinates. The annual cycle of the current system can be as brief as four months when supervisors and subordinates are posted in and out of positions outside of the appraisal cycle. ML works on a constant supply of data, and performance recording is a constantly updated dataset; however, because it depends on individual preferences for recording, it highlights a key weakness of ML: the need for data to be input in a consistent way.
A way to mitigate this weakness, and to negate the record-keeping weakness of varying individual reporting tools, is an online database that requires consistent and timely small-form reporting, building a larger picture of the individual's performance that can be analysed by ML. The ADF human resources program PMKeyS is a database that contains data on all personnel and links managers to their reports. It also has a notification tool that ensures managers are held to account. For this scenario, the online form is replaced with a PMKeyS employee-facing interface in the electronic Personnel Appraisal Reporting (ePAR) system. The input becomes a monthly entry based on the performance, potential, experience and qualifications of the individual in the reporting month. This is where other reporting, such as records of conversations, can be held. It could be achieved with very specific drop-down inputs describing how the individual is performing, and by standardising what words can and cannot be used. This would allow the ML algorithm to analyse performance from both a quantitative and a qualitative perspective. This does raise the issue of time: the system would need to be designed so that the time spent each month on the report is less than the time spent maintaining reporting notes and completing an annual form. The threshold number of subordinate reports that would make this unworkable would need to be understood. If a manager has 10 reports and spends 15 minutes per person a month to counsel and complete the report, totalling 30 hours a year, is that comparatively less than the time spent on interim and annual reports? Time and quality indicate value. But is the extra time spent on reporting providing a more effective workforce through more accurate management?
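By way of illustration, the following minimal sketch shows what a standardised monthly input might look like as a data structure, together with the time comparison raised above; the drop-down categories, field names and figures are assumptions for illustration only.

```python
# A minimal sketch of a standardised monthly appraisal input, with the time
# comparison from the paragraph above. Categories, fields and values are
# hypothetical illustrations only.
from dataclasses import dataclass, field

PERFORMANCE_OPTIONS = ("below expectations", "meets expectations", "exceeds expectations")

@dataclass
class MonthlyAppraisal:
    member_id: str
    month: str                      # e.g. "2024-03"
    performance: str                # restricted to PERFORMANCE_OPTIONS
    qualifications_gained: list = field(default_factory=list)
    record_of_conversation: str = ""

    def __post_init__(self):
        # Enforce the standardised drop-down vocabulary.
        if self.performance not in PERFORMANCE_OPTIONS:
            raise ValueError(f"performance must be one of {PERFORMANCE_OPTIONS}")

example = MonthlyAppraisal(
    member_id="0000000",            # dummy identifier
    month="2024-03",
    performance="meets expectations",
    qualifications_gained=["Combat First Aid"],
)
print(example)

# Time comparison from the paragraph: 10 reports x 15 minutes per person per month.
reports, minutes_per_report, months = 10, 15, 12
annual_hours = reports * minutes_per_report * months / 60
print(f"Monthly reporting load: {annual_hours:.0f} hours per year")  # 30 hours
```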
The monthly appraisal could give better quality data on a person’s performance across a defined period as it is recorded closer to each reportable instance like project completion, poor performance or improvement. This could also increase the actual reporting on an individual, as a monthly input means there are only two months (December and January) when a report would not need to be raised. The use of a predefined system also allows those who are not the direct supervisor to raise part of the appraisal. For example, if the member is attached to an external unit for two months, the external supervisor can raise the report and the member’s home unit supervisor can see the report and approve it through PMKeyS.
This helps the manager provide a more accurate appraisal of the individual, as they do not have to try to build a picture from their own notes (or those of others) over a long period. If the appraisal is done month to month, the ML algorithm can moderate the individual across the years. In the annualised report, if a person has a poor few months before their appraisal, the supervisor may be biased by that poor performance instead of accurately weighing all the information recorded. There is a danger that subordinates become too focused on the reporting and try to appear more effective before the report instead of just consistently doing good work.
The second element is the consistency with which this information is applied to the defined merit system. ML can assist in correctly identifying performance improvements across the career management cycle by moderating more information across an entire rank performance time frame. The average captain's time in rank is six years, and the last five reports are used to determine suitability for promotion, so each report carries greater weight and a single poor report can materially affect the assessment. Although there is human moderation, this is once again constrained by the amount of information decision-makers are able to assess in the time frame they have and the support they may be able to get from the chain of command.
An ML algorithm given monthly reporting inputs could conduct moderation quickly, using more information, for a Personnel Advisory Committee (PAC) overseeing an ADF member. A captain could have 60 monthly reports of various levels across their time in rank; this data can be used to show their performance trajectory in greater detail, both negative and positive. More detailed and accurate information on anomalies can be found, and the algorithm can even track how people report. This would provide more consistent assessments of an individual's suitability for promotion, since the ML algorithm can assess the higher fidelity reporting and provide sound decision support to the PAC. Through the use of context algorithms, ML can target key words and phrases that help delineate candidates who are otherwise similar.
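As a final illustration, the following minimal sketch moderates a series of hypothetical monthly performance scores into a rolling average and an overall trend, the kind of decision support a PAC might draw on; the scores and their numeric encoding are invented.

```python
# A minimal sketch of moderating monthly performance scores across a rank to
# show a trajectory. Scores are hypothetical numeric encodings
# (1 = below, 2 = meets, 3 = exceeds expectations).
import numpy as np

# 60 hypothetical monthly scores across a captain's time in rank.
rng = np.random.default_rng(0)
monthly_scores = np.clip(rng.normal(loc=2.2, scale=0.4, size=60), 1, 3)

# A 12-month rolling average smooths short-term dips so that a poor few
# months do not dominate the overall picture.
window = 12
rolling = np.convolve(monthly_scores, np.ones(window) / window, mode="valid")

# A simple linear trend indicates whether performance is improving or declining.
slope = np.polyfit(np.arange(len(monthly_scores)), monthly_scores, 1)[0]
print(f"Latest 12-month average: {rolling[-1]:.2f}")
print(f"Overall trend per month: {slope:+.3f}")
```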
As shown above, ML can be used to reduce the time to process appraisals and provide an accurate picture of personnel. The fidelity of the information on personnel is increased as a strong dataset is developed of each individual. This expedites the process, allowing for faster analysis and recommendations.
The process described above could allow for real-time changes as the needs or expectations of the ADF member evolve. These expectations could be included as data points for analysis. These data points could be overlain with the organisational plots to create an algorithm that can learn to place people in the best locations for Defence and for the member. Optimising Defence capability is the primary goal; however, minimising disruption may reduce personnel separation rates.
This would also help manage expectations. Individuals know that their preferences are plugged into a machine, removing an element of human bias that occurs in the current system. This ML-assisted decision-making could reduce the resources required to manage the posting plot.
Conclusion
These examples do not posit that ML has reached a stage where it would be advisable for an automated system to make decisions for the ADF. Yet it could very readily assist a decision-maker, either through recommendations or by acting as a guide through policy and law. This reduces the risk that decision-makers will blindly rely on ML because they do not want to take (or do not have) the time to ensure the decision is correct, a reliance that could otherwise leave a decision-maker unable to query the decision inputs yet still responsible for the decision and its outcome.
Consistency in decision-making is an issue not only in the ADF but also more broadly in public and administrative law.46 Standardising the tests used for administrative sanctions, sentencing, or postings and promotions would mean that the more the algorithm is used, the more data becomes available. The application of ML could foreseeably result in quicker, more consistent decisions across the ADF. But to ensure the integrity of the system, any attempt to improve the standard and quality of decision-making must be tested.
Machine learning has significant potential to enhance the decision-making of the ADF by reducing the cognitive clutter that an individual must sift through to reach an informed decision. The military justice system can benefit by giving lay summary authorities consistent sentencing considerations, through a tool already utilised by qualified judicial officers in the wider community. Career management could draw on more useful, higher fidelity information about how members are assessed, supporting decisions that select the best personnel for future positions in the ADF.
There are a number of issues that must be considered when looking at how ML could be used to support decision-making, as its use could have detrimental or unintended consequences. Utilising ML within Defence may require legislative support, and will require comprehensive policy on accountability and ownership. Policy will also need to be developed on how reviews are conducted, especially around accountability for decisions in the new grey area of algorithmic decision support. This will take time as ML tools become more widely used. It could lead to the use of systems to review the systems, as there will come a point where a qualified person cannot review the code.
If implemented correctly and with due consideration of potential pitfalls, the use of algorithms to help synthesise information in various administrative and disciplinary functions could create a more efficient, more transparent and fairer system for the ADF.
About the Author
Captain Samuel White has served both as a Royal Australian Infantry Corps and an Australian Army Legal Corps officer. In 2018, he was appointed Associate to Justice John Logan RFD of the Federal Court of Australia and current President of the Defence Force Discipline Appeals Tribunal. He is currently posted to the Directorate of Operations and International Law.
Endnotes
1 Igoe v Major General Michael Ryan in his capacity as a Reviewing Authority (No. 2) [2020] FCA 1091 at [1] (Logan J).
2 Australian Army, Royal Australian Infantry Corps, 19 December 2016, at: https://www.army.gov.au/our-people/corps/royal-australian-infantry-corps
3 Monika Zalnieriute, Lyria Bennett Moses and George Williams, 2019, ‘The Rule of Law and Automation of Government Decision-Making’, Modern Law Review 82, no. 3: 425.
4 Deyi Li and Yi Du, 2017, Artificial Intelligence with Uncertainty (CRC Press).
5 R Kocielnik, S Amershi and PN Bennett, 2019, ‘Will You Accept an Imperfect AI? Exploring Designs for Adjusting End-User Expectations of AI Systems’, Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–14.
6 S Makridakis, E Spiliotis and V Assimakopoulos, 2018, ‘Statistical and Machine Learning Forecasting Methods: Concerns and Ways Forward’, PLOS One, 13, no. 3.
7 Dominique Hogan-Doran, 2017, ‘Computer Says ‘No’: Algorithms and Artificial Intelligence in Government Decision Making’, The Judicial Review 13: 1–39. See also Australian Government, 2007, Automated Assistance in Administrative Decision-Making: Better Practice Guide, 4.
8 M De Choudhury, S Counts, E Horvitz and A Hoff, 2014, in Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media (Palo Alto, CA: The AAAI Press).
9 S Suthaharan, 2014, ‘Big Data Classification: Problems and Challenges in Network Intrusion Prediction with Machine Learning’, ACM SIGMETRICS Performance Evaluation Review 41, no. 4: 70-73.
10 ML Concepts, Google Crash Course, at: https://developers.google.com/machine-learning/crash-course/classificat…
11 H Bhavsar and A Ganatra, 2012, ‘A Comparative Study of Training Algorithms for Supervised Machine Learning’, International Journal of Soft Computing and Engineering 2, no. 4: 2231–2307.
12 M Alloghani, D Al-Jumeily, J Mustafina, A Hussain and AJ Aljaaf, 2020, ‘A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science’, in MW Berry, J Azlinah and BW Yap (eds), Supervised and Unsupervised Learning for Data Science (Cham: Springer), 3–21.
13 Anon., 2013, ‘Kahneman and Klein on Expertise’, Judgment and Decision Making (blog), at: https://j-dm.org/archives/793
14 E Ronchi and D Nilsson, 2013, ‘Fire Evacuation in High-Rise Buildings: A Review of Human Behaviour and Modelling Research’, Fire Science Review 2, no. 1: 7.
15 Suraj Acharya, 2019, ‘Tackling Bias in Machine Learning’, Insight Data Science (blog), at: https://blog.insightdatascience.com/tackling-discrimination-in-machine-…-5c95fde95e95
16 International Committee of the Red Cross, ‘Autonomy, Artificial Intelligence and Robotics: Technical Aspects of Human Control’, ICRC website, at: https://www.icrc.org/en/document/autonomy-artificial-intelligence-and-robotics-technical-aspects-human-control
17 A Torralba and AA Efros, 2011, ‘Unbiased Look at Dataset Bias’, CVPR 2011, 1521–1528.
18 TM Mitchell, 1980, The Need for Biases in Learning Generalizations (New Brunswick, NJ: Department of Computer Science, Laboratory for Computer Science Research, Rutgers University), 184–191.
19 See Administrative Review Council, 2004, Automated Assistance in Administrative Decision Making: Report to the Attorney-General, Report No. 46 (Canberra: Commonwealth of Australia). See also, M Perry and A Smith, 2014, ‘iDecide: The Legal Implications of Automated Decision Making’, Federal Judicial Scholarship 17.
20 Social Security (Administration) Act 1999 (Cth) s 6A; Therapeutic Goods Act 1989 (Cth) s 7C; Migration Act 1958 (Cth), s 495A. See more generally Samuel White, 2021, ‘Review of Delegated Automated Decision Making’, AIAL Forum 101(3).
21 This system came into effect with the new Summary Authority Rules 2019 (Cth).
22 Defence Force Discipline Act 1982 (Cth), s 70(1)(a).
23 Ibid, s 70(1)(b).
24 State of Wisconsin v Loomis 881 N.W.2d 749 (Wis. 2016). Since 2013, several states have now banned or prohibited the use of such systems due to the inability to correct for racial bias.
25 Jordan Hyatt and Steven L Chanenson, 2017, The Use of Risk Assessment at Sentencing: Implications for Research and Policy, Villanova Law/Public Policy Research Paper No. 2017-1040, 33–54.
26 Office of the Judge Advocate General, Department of Defence website, at: https://www.defence.gov.au/jag/Court-Martial-Magistrate-Proceedings.asp
27 Private R v Cowen [2020] HCA 31 at [51] (Kiefel CJ, Bell and Keane JJ).
28 See, for example, David Letts and Rob McLaughlin, 2019, ‘Intersection of Military Law and Civil Law’, in Robin Creyke, Dale Stephens and Peter Sutherland (eds) Military Law in Australia (The Federation Press), 100.
29 MILPERSMAN Part 4, Chapter 1, Annex 1J.
30 Ibid, [30].
31 Hogan-Doran, 2017.
32 Department of Human Services, 2013, 2012–13 Annual Report (Canberra: Commonwealth of Australia), 68, 69.
33 John McMillan, 2007, ‘Automated Assistance to Administrative Decision Making: Launch of the Better Practice Guide’, presentation, Institute of Public Administration of Australia, Canberra, 23 April 2007, 10.
34 Defence Regulation 2016 (Cth), s 24(1)(c).
35 Ibid., s 6(2).
36 Ibid., s 24(3).
37 Such as the requirement in Defence Regulation 2016 (Cth), s 24(2) to give 14 days to respond to a termination notice.
38 If such an approach had been taken, the jurisdictional error in Martincevic v Commonwealth (2007) 164 FCR 45 may have been avoided.
39 McCleave v Chief of Navy [2019] ADFDAT 1 (Hiley and Garde JJ).
40 Ibid., [22].
41 Ibid., [40]–[69].
42 Ibid.
43 Justice John Logan, 2018, ‘Administrative Discharge in Lieu of Military Disciplinary Proceedings—Supportive or Subversive of a Military Justice System?’, presentation, Queensland Tri-service Reserve Legal Officers’ Panel Training Day, Brisbane, 16 November 2018.
44 See, for two confirmations of the fact, Stuart v Sanderson (2000) 100 FCR 150 at [41] (Madgwick J); and Green v Chief of Army [2011] ADFDAT 2.
45 This is, admittedly, a very large and overlapping area for consideration, noting that ‘Duty and Discipline do not march well with Discontent’ (Marks v The Commonwealth (1964) 111 CLR 549, 575 (Windeyer J)).
46 See S Lohr, ‘If Algorithms Know All, How Much Should Humans Help?’, The New York Times, 6 April 2015, accessed 23 August 2017, at: https://nyti.ms/1MXHcMW; see also D Schartum, 2016, ‘Law and Algorithms in the Public Domain’, Nordic Journal of Applied Ethics 10, No. 1: 15–26.