The Diffusion of ICT for Corruption Detection in Open Government Data

a Accounting Research Institute, Universiti Teknologi MARA, Level 12, Menara Sultan Abdul Aziz Shah, 40450 Shah Alam, Selangor, Malaysia
b Department of Technology, Policy and Management, Delft University of Technology, Building 31, Jaffalaan 5, 2628 BX Delft, Netherlands
c Graduate School of Economics and Management, Ural Federal University, Lenin Ave, 51, Yekaterinburg, Sverdlovskaya oblast', 620075, Russia


I. Introduction
Recently, open data has become popular owing to the rapid growth of information technology [1]. Government agencies, state-owned companies and nonprofit organisations have started initiatives to open their data in order to enhance transparency and accountability towards their stakeholders [1]. Open data enables the public to access data freely and, subsequently, to monitor and participate in government activities. Some developing countries have already implemented open data to fight corruption.
Information and Communication Technology (ICT) development enables the opening of data to create transparency and has the potential to yield anti-corruption tools. According to [2], there are five significant ways in which ICT can help reduce corruption risks: raising awareness of specific governance problems (types of corruption); providing low-cost online platforms to monitor and promote more inclusive, transparent and accountable decision-making, which can reduce the cost of distributing, accessing and collecting government information [3]; reducing the incentives for corruption by reducing direct contact and familiarity between end-users and decision-makers; enabling more effective control of financial transactions that may put the integrity of politically exposed agents (individual or collective) at stake; and speeding up public awareness of anti-corruption campaigns.

Corruption is a problem in both the public and the private sector and comes in many shapes and forms, including bribery [4], embezzlement [5], theft [6], extortion [7], abuse of power [8], discretion [9], favouritism [10], conflict of interest [11], and improper political contributions [12]. The consequences of corruption are diverse. Corruption can harm society: it can increase poverty, diminish the money available for essential government services, destroy citizens' trust in government and undermine economic growth. Open data has been used as a mechanism to fight corruption and has been implemented successfully. The original mission of open data is to enhance credibility, the democratisation of government decisions, participation and the promotion of a global culture of transparency and accountability. For example, Timor-Leste, a small island nation close to Indonesia with about 1 million inhabitants, is using financial data to monitor and control corruption [13]. Thus, this study aims to investigate how open data can be used to detect corruption.
This paper explores the literature on ICT development and open data as a mechanism to fight corruption. The institutional setting is taken into account, but cultural change is outside the scope. The following section provides background on the relevant developments, including open government data (OGD), data-driven detection of corruption and an overview of stakeholders. Next, the challenges for data-driven corruption detection are discussed. After that, the problem statement and research objective are presented. Finally, this proposal discusses the research phases and research method.

II. Methods
This study reviews past literature on the role of open data in mitigating corruption. The literature review focuses on 1) open government data, 2) information architecture and 3) corruption detection. The critical concepts determined for the literature review include open government data, corruption, and accountability. Articles were found using Scopus, JSTOR, the ACM Digital Library, and Google Scholar. Snowballing is also used: the citations in the identified articles are examined and the relevant cited articles are added.
A literature review is one of the critical elements of research and is generally regarded as fundamental to a research project [14]. Webster and Watson [15] observed that novice researchers often treat the literature review as nothing more than grouping some papers and summarising them, or as collating multiple research manuscripts into an annotated bibliography. Hart (1998), as cited in [15], defines the literature review as "the use of ideas in the literature to justify the particular approach to the topic, the selection of methods, and demonstration that this research contributes something new". Hart (1998) also noted that, for the literature review, "quality means appropriate breadth and depth, rigour and consistency, clarity and brevity and effective analysis and synthesis" (cited in [15]). Reporting Shaw's study, [15] emphasised that a literature review should "explain how one piece of research builds on another". In line with these definitions, Webster and Watson [15] define the literature review as one that "creates a firm foundation for advancing knowledge. It facilitates theory development, closes areas where a plethora of research exists, and uncovers areas where research is needed". According to [15], an effective literature review should do four things. First, it should methodologically analyse and synthesise quality literature. Second, it should provide a firm foundation for the research topic. Third, it should provide a firm foundation for the selection of the research methodology. Finally, it should demonstrate that the proposed research contributes something new to the overall body of knowledge or advances the knowledge base of the research field.
Figure 1 describes the three-step process used to conduct the literature review. The process comprises 1) inputs, 2) processing and 3) outputs, and gives an overall view of the proposed approach. This proposal uses a literature review as the first step to define the critical constructs of the study and to identify the factors that influence the use of open data to detect corruption. Webster and Watson [15] stated that a useful, high-quality literature review is based on a concept-centric approach rather than a chronological or author-centric approach. While reviewing and writing, researchers must ask themselves whether the articles presented are relevant to the study [15].
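The difference between an author-centric and a concept-centric review can be illustrated with a small concept matrix. The following is a minimal sketch only: the article labels are hypothetical placeholders rather than entries from the reference list, and the concepts are the three critical concepts named in this section.

```python
from collections import defaultdict

# Hypothetical placeholder articles mapped to the concepts they address.
articles = {
    "Article A": {"open government data", "accountability"},
    "Article B": {"corruption", "accountability"},
    "Article C": {"open government data", "corruption"},
}


def concept_matrix(articles):
    """Invert an article -> concepts mapping into concept -> sorted article list."""
    matrix = defaultdict(list)
    for article in sorted(articles):
        for concept in sorted(articles[article]):
            matrix[concept].append(article)
    return dict(matrix)


matrix = concept_matrix(articles)
for concept in sorted(matrix):
    # One line per concept, listing every article that discusses it.
    print(f"{concept}: {', '.join(matrix[concept])}")
```

Organising the review by the keys of such a matrix, rather than by author, is one way to keep the discussion concept-centric.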

III. Results and Discussion
This part presents a global overview of how ICT, especially open data, can be used to mitigate corruption.

A. Opening the Government Data
The former President of the United States of America, Barack Obama, stated in a memorandum on Transparency and Open Government that the government should ensure public trust and establish a system of transparency, public participation, and collaboration [16]. The memorandum holds that openness will strengthen democracy and promote efficiency and effectiveness in government. The European Commission also states that the availability of raw data and documents in various readable formats and languages may maximise the reuse value of public sector information (PSI) (European Commission, 2010, p.5).
Therefore, the government should provide the most practical data for users. Data should be available for free over the Internet, in open, structured, machine-readable formats, to anyone who wants to use it [17]. According to [18], open data is data that can be freely accessed on the internet and re-used without limitation. Similarly, [19] states that everyone can freely see, use and pass on open data to others. Open data can be freely used, re-used and distributed by anyone, with users making their own work available to be shared as well [20]. These definitions show that the main characteristics of open data are access, (re)use and machine-readability. Machine-readability should ensure that massive amounts of data can be processed automatically. Open data has the potential to support the detection of corruption; however, how this can be done is not yet known. Data collection, current systems, administrative processes and institutional arrangements might need to be changed. A reference information architecture that captures administrative processes, data, software systems and organisational principles can help to develop systems that support the detection of corruption.
Open data has become an important and growing topic in many developing countries [21] because it can improve transparency, accountability and citizen participation [22]. According to [22], there are five primary drivers of open data initiatives in developing countries. First, an open data initiative can be motivated by politicians who want to improve the information flow within the government and with other stakeholders in order to reduce administrative burden, costs and inefficiencies. Second, increased accountability may strengthen the government's policies by providing better information on regional, local or sectoral government activities. The third driver is pressure from civil society, the media, parliamentarians or private companies. The fourth is international pressure to create data transparency. The last driver is the government's desire to gain reputation through transparency.

B. ICT for Corruption Mitigation
This part presents a global overview of how ICT can be used to mitigate corruption. There are three types of ICT indicator: ICT access, ICT use and ICT skills. We focus only on ICT access, which consists of five proxies: fixed-telephone subscriptions, mobile phones, internet bandwidth, computer use, and internet access. [2] argues that there are five major ways in which ICT can help reduce corruption risks: 1) raising awareness of specific governance problems (types of corruption); 2) providing low-cost online platforms to monitor and promote more comprehensive, transparent and accountable decision-making, which can reduce the cost of distributing, accessing and collecting government information [3]; 3) reducing the incentives for corruption by reducing direct contact and familiarity between end-users and decision-makers; 4) enabling more effective control of financial transactions that may put the integrity of politically exposed agents (individual or collective) at stake; and 5) speeding up public awareness of anti-corruption campaigns.
Fig. 1. The three stages of an effective literature review process [15]
ICT, or information and communications technology, refers to activities that provide access to information through electronic technologies such as the Internet, wireless networks, cell phones and other communication media [23] [24]. Some prior studies argue that ICT has a positive effect on fighting corruption and improving the quality of governance. Previous studies indicated that ICT can give countries a new means of creating transparency and promoting anti-corruption [3][25][26][27][28][29][30][31]. Consequently, many countries have tried to link transparency to ICT-based initiatives, for example through e-government [32].
Furthermore, according to [33], ICT has successfully supported the quality of governance in India. ICT supports the decisions of public administrators and improves the planning and monitoring of programmes for more transparent public services through access to information and knowledge. One example is the use of a geographical information system (GIS) for planning the location of rural facilities or identifying disaster areas. Another example is the use of the telephone to foster socioeconomic development [34]. People can reduce their communication costs because the telephone can minimise the number of communication links among parties.
Following that, much of the ICT literature has commented on the use of ICT to reduce corruption. For instance, according to [30], ICT can help reduce corruption by promoting good governance, strengthening reform-oriented initiatives, reducing the potential for corrupt actions, improving the relationship between citizens and public employees, and allowing citizens to follow, monitor and control the actions of public employees. In addition, [35] stated that, to succeed in reducing corruption, ICT initiatives must shift towards disclosing information. As a result, society, NGOs, researchers and politicians can track the decisions and actions taken by government employees.
At the same time, some governments see the implementation of ICT as a means of encouraging both efficiency and transparency [36]. In general, ICT is envisioned as a practical tool to reduce corruption, although social culture can reduce its effectiveness as an anti-corruption measure [29]. Statistical analyses and case studies show that ICT does a great deal to reduce corruption. In particular, ICT can improve the effectiveness of managerial and internal control over fraudulent behaviour, as well as promote the transparency and accountability of governments [30]. A study by [37] that analysed corruption data in relation to ICT-enabled e-government initiatives concluded that corruption could be reduced significantly by implementing e-government, "even after controlling for any propensity for corrupt governments to be more or less aggressive in adopting e-government initiatives" ([37], p. 210).
Other studies examined the successes of e-government in reducing corruption in regions such as the Americas, Europe and Asia [21] [30]. The most prominent successes of e-government in fighting corruption are in the areas of taxes and government contracts [12]. For example, in India, putting rural property records online has significantly improved the speed at which the records are retrieved and updated, while at the same time eliminating opportunities for local employees to take bribes, which had previously been widespread [21]. This online property record system, the Bhoomi electronic land record system in Karnataka, India, is estimated to have saved 7 million in bribes to local employees in its first several years. Before the system was implemented, citizens had to pay Rs.100 in bribes to a local employee, whereas the electronic system costs only Rs.2 [38].
Similarly, Pakistan uses e-government for tax transactions. The government restructured the entire tax system and the tax department around e-government. The aim is to decrease face-to-face contact between tax employees and citizens and thereby reduce the opportunities for soliciting bribes [37]. Likewise, the Philippine Department of Budget and Management created an e-procurement system for bidding on government contracts, both to avoid price fixing and to provide accountability to the public. In the same way, in Chile, an e-procurement system was established to allow the public to see and compare the costs and services of the bids purchased by the government. The e-procurement system provides 500 outsourced services from more than 6,000 providers [30]. The system saves approximately US$150 million per year by preventing price fixing or price inflation by corrupt officials and contractors. Moreover, this new system contributes to reducing corruption and allows small businesses to participate in the government bidding process [39].
In Fiji, the success of e-government in reducing corruption has improved the public perception of government corruption. As a result, it has improved the responsiveness of public employees in providing better services to citizens [40]. In addition, the United States has established websites that give access to government expenditure data, for instance Recovery.gov, USAspending.gov for general funds, and IT.USAspending.gov for information technology funds; the purpose is to involve society in controlling government spending so that wasteful projects are identified and removed earlier [16]. Some states in the US have adopted similar websites that involve citizens in controlling and monitoring government spending for waste and fraud. Additionally, some U.S. government websites permit the tracking of transactions, so that one can monitor the progress of an application or a request for government services. For example, U.S. Citizenship and Immigration Services (USCIS) gives immigrants access to check the progress of their applications.
Similarly, the U.S. Department of State allows passport applicants to check the progress of their applications. These services allow a significant number of users (such as citizens, residents and immigrants) to monitor the progress of their applications online. As a result, they save users time, improve efficiency and offer reasonable timeframes for processing some services, documents and resources [12].
Nevertheless, [2] argues that some conditions should be in place to make ICT a weapon against corruption, such as the training of public officials and institutional arrangements. The latter are necessary to enable further analysis and intervention after possible corruption has been spotted. Since corruption is rooted in culture, ICT alone is not sufficient to reduce it; ICT must be combined with public governance, institutions, the media and society [2].

C. Open Data-driven corruption detection
More and more data is becoming available that can be used to detect possible corruption. The availability of large volumes of data is often called 'big data' [41]. Corruption is often hard to detect and to observe. Anomalies, outliers or changes in patterns might be a sign of corruption. It is likely that multiple data sources need to be combined to be able to detect corruption. Each of these sources might provide complementary insight. However, different sources might be inconsistent or paint different pictures of the situation, which makes analysis difficult. Differences might be a sign of corruption, but can also be due to problems in data collection.
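As an illustration of how outliers in open data might be flagged for further inspection, the sketch below applies a modified z-score based on the median absolute deviation (MAD). This is only one possible technique and is not prescribed by the literature reviewed here; the function name, the threshold of 3.5 and the contract values are assumptions chosen for illustration. A flagged value is a prompt for review, not evidence of corruption, since it may equally stem from a data collection problem.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), which are less affected by extreme values than mean and variance.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values (or at least half of them) are identical: nothing to scale by.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical contract values for comparable procurements (illustrative only).
contract_values = [100, 102, 98, 101, 99, 103, 250, 97]
print(flag_outliers(contract_values))  # [6]
```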
The availability of data is related to datafication. Datafication can be defined as "the ability to quantify all sorts of information into machine-readable data format". Datafication has created the need to develop new capabilities for handling large volumes of data [42]. The use of Internet of Things (IoT) sensors enables the collection of data at the source. By collecting data directly, the chance of data manipulation is reduced. The creation of new content, connectivity, analysis software and infrastructure may continuously advance datafication. One of the most significant changes is the transformation from closed data to an open, interconnected world in which the traditional roles of, and the relations between, sectors are changing [18]. Examples of IoT data collection include pollution measurement, video cameras, the weighing of goods, and counting the number of cars (or people) passing. This kind of information might be useful for detecting corruption. For example, more cars might have passed than tolls have been paid for.
Data-driven corruption detection is a complex process in which data needs to be collected, processed, analysed and acted upon. Many stakeholders can be involved in this process. Figure 2 shows schematically how corruption can be detected by collecting and using data. First, data is collected from the process under scrutiny, in which various actors can play a role. Different types of data can be collected at various points in time and from multiple stakeholders. Data can be collected in different areas, including budgeting, actual spending, policy programmes and so on. The categories of data collected are likely to determine whether it is possible to detect corruption.
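The toll example above amounts to reconciling two independent data sources. The following minimal sketch assumes hypothetical station names, counts and a 5% tolerance, none of which come from the studies cited; it compares sensor counts of passing cars with the number of tolls recorded as paid and reports stations where the shortfall exceeds the tolerance.

```python
def reconcile_tolls(sensor_counts, paid_counts, tolerance=0.05):
    """Return {station: shortfall} where paid tolls fall short of sensor counts.

    The shortfall is the fraction of counted cars with no recorded payment.
    Stations with a shortfall at or below the tolerance are not reported,
    because small gaps may be caused by sensor or recording errors.
    """
    flagged = {}
    for station, counted in sensor_counts.items():
        if counted == 0:
            continue  # No traffic counted, so there is nothing to reconcile.
        paid = paid_counts.get(station, 0)
        shortfall = (counted - paid) / counted
        if shortfall > tolerance:
            flagged[station] = round(shortfall, 3)
    return flagged


# Hypothetical daily figures (illustrative only).
sensor_counts = {"station_a": 1000, "station_b": 800, "station_c": 1200}
paid_counts = {"station_a": 990, "station_b": 600, "station_c": 1195}
print(reconcile_tolls(sensor_counts, paid_counts))  # {'station_b': 0.25}
```

A flagged station indicates an inconsistency between sources that warrants follow-up; as noted earlier, such differences can also result from problems in data collection.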

D. The information architecture, modelling, and principles
Mechanisms for information sharing need to be in place to detect corruption. Public organisations need guidance to develop systems and processes for disclosing data to detect corruption. In this research, the focus is on developing a reference architecture for corruption detection. By using a reference information architecture, organisations can develop systems that can be used to detect corruption.
There are various views on what constitutes an information architecture. In general, an information architecture (IA) is a coherent whole of principles and models capturing elements such as organisational structure, business processes, data, and IT infrastructure. It is a formal description of a system, or a detailed plan of the system at component level, that guides its implementation; it also covers the structure of components, their inter-relationships, and the principles and guidelines governing their design and evolution over time [43].
Furthermore, a reference architecture is defined as a set of principal guidance for implementation, together with a system structure and components, that is simultaneously applicable to multiple related specific systems, with explicit variation (Taylor, Medvidovic, & Dashofy, 2009, p.58). [43] explains the advantages of using a reference architecture as follows.
1. Accelerating the analysis of the whole system or parts of it.
2. Improving reusability and connectivity.
3. Decreasing mistakes and errors.
Information architecture should provide a blueprint of the desired situation and an overall plan for its implementation. Stakeholders can use the architecture to make decisions about the system development strategy. Software architecture, enterprise architecture and reference architecture differ in many ways, notably in their generality and their scope [44]. A reference architecture is a generic architecture, while enterprise architecture and software architecture are architectures for a specific situation. The target of enterprise architecture is the enterprise as a whole, while the scope of software architecture is limited to a specific solution. Information architecture describes the relationships between business processes, applications and information sources, and aims at storing, processing, reusing and distributing information across information resources. In other words, information architecture is the organisation of information to aid information sharing among actors. A reference architecture, however, can be applied at all levels, both solution and enterprise [43]. What a reference architecture should contain varies in the literature. We follow [43], according to which a reference architecture includes 1) architectural principles, 2) implementation guidelines, and 3) system structures and components.
By contrast, [45] categorises the contents of a reference architecture into three separate classes: 1) customer context, 2) business architecture, and 3) technical architecture. The customer context consists of customer enterprises and users as well as their interactions; the business architecture comprises a business model and life cycle; and the technical architecture, which provides technological solutions, includes design patterns and technology. According to [46], a reference architecture has three primary elements: (1) statements of technical positions that guide system architects in making technical decisions; (2) an organisation-wide consolidated infrastructure blueprint that shows how the various enterprise components are hooked together; and (3) individual reference architectures for specific types of systems. Despite the different contents given by various authors, there are some commonalities that should be included in a reference architecture: (1) the architecture should be both descriptive and prescriptive (a blueprint), describing system structure and components as well as their interactions; and (2) it should offer practical guidance for implementation (principles, guidelines, or technical positions). Some authors vary the contents by adding business context and customer context to a reference architecture. In our work, we consider a reference architecture as a perspective on information architecture that describes the highest level of abstraction of a system.
In terms of modelling languages, there are various languages for architecture modelling, such as ArchiMate and BPMN. The ArchiMate language contains concepts for describing the relationships between architecture descriptions at the business, application, and technology levels. It plays a central role in addressing the ubiquitous problem of business-ICT alignment. ArchiMate conforms to existing languages or standards, such as the Unified Modelling Language (UML), for each architectural domain [47].
The Business Process Modelling Notation (BPMN) is a standard for modelling processes that provides unambiguous symbols and constructs for mapping out processes, resulting in simple, communicative models [48]. BPMN will be used for mapping business processes, in which the entity responsible for executing tasks can be modelled using swim lanes. The connections among organisations can be modelled as exchanges of information, with the modeller choosing the aggregation level at which a service is specified. This allows business processes, organisational responsibilities, and the data stored and exchanged to be linked to (sub)processes and tasks. Our focus is on data sources, information exchange and business processes; therefore, BPMN will be adopted as the modelling language. Furthermore, architecture principles can be defined as rules that must be followed, that emphasise "doing the right things" and that are expected to yield significant improvement [44]. Guidelines are supporting practical guides that often cannot be followed completely and require trade-offs; system structures are the levels of structure of the system that are expected to satisfy the requirements; and system components are the components of the system that are expected to satisfy the requirements.
Architectural principles are part of the reference architecture. Principles are defined as rules that have to be followed and that are expected to provide significant improvement [49]. In other words, a principle is a normative, reusable and directive statement that guides architects in designing the capabilities needed to achieve overarching goals. Principles are used to solve complex or structural problems that cannot be formulated in clear, quantitative terms or solved with computational techniques [47]. Principle-based design (PBD) is suitable for information system (IS) design in multi-task environments [48] characterised by different sets of goals and processes of various users; unfamiliar events and processes in non-routine task environments; different audiences (architects, IT developers, IT auditors, system managers and operators); deep uncertainty; and a wide range of dynamic technical solutions and alternatives. Principles can give guidance at multiple levels of the design process, and architects and IT auditors can use them as a checklist for evaluating an existing information system. The implementation of the reference information architecture is aimed at enabling the design of information architectures that allow corruption detection.
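The use of principles as an evaluation checklist can be sketched as follows. This is a hypothetical illustration, not the reference architecture itself: the two principles are paraphrased from points made earlier in this paper (machine-readable formats and data collection at the source), and the system description fields `format` and `collected_at_source` are assumed names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Principle:
    """A normative statement plus a check that tests a system description."""
    name: str
    check: Callable[[Dict[str, object]], bool]


# Illustrative principles only; a real reference architecture would define its own.
PRINCIPLES: List[Principle] = [
    Principle(
        "Data is published in a machine-readable format",
        lambda system: system.get("format") in {"csv", "json", "xml"},
    ),
    Principle(
        "Data is collected at the source",
        lambda system: system.get("collected_at_source") is True,
    ),
]


def audit(system: Dict[str, object], principles: List[Principle]) -> List[str]:
    """Return the names of the principles that the described system violates."""
    return [p.name for p in principles if not p.check(system)]


# Hypothetical description of an existing information system.
legacy_system = {"format": "pdf", "collected_at_source": True}
print(audit(legacy_system, PRINCIPLES))
```

In practice many principles cannot be reduced to an automated check, so a checklist of this kind would complement, not replace, the judgement of architects and IT auditors.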

IV. Conclusion
Open data is one of the tools for fighting corruption; opening up government data to the public is an excellent strategy for exercising control over the public and private sectors. The mission of open data is to improve the credibility, democratisation, transparency and accountability of government decisions. In addition, some requirements must be in place for ICT development to become a tool for fighting corruption, for example the training of officials and institutional arrangements. ICT by itself is not sufficient to reduce corruption; it must work together with public governance, NGOs, the media and society.
In this paper, we consider creating a framework for a reference architecture, viewed as a perspective on information architecture that describes the top level of abstraction of a system. Future research will have to show whether this framework can be applied in a case study. The practical contribution is aimed at helping governments to detect corruption using a data-driven approach. The scientific contribution will originate from the development of a reference architecture and architectural principles that enable the design of information architectures for detecting corruption.