Alex Hawala - IB Math Exploration - Benford's Law

September 14, 2017 | Author: Alex Hawala | Category: Fraud, Statistics, Physics & Mathematics, Mathematics, Science

Benford’s Law
Word count: 1179 words
Candidate Name: Alex Evat Lineekela Hawala
Candidate Number: 0015
School Number: 001179
Maths Exploration
Windhoek International School


Rationale

I wrote this Math Exploration on Benford’s Law because I wanted to find a statistical concept that could be applied to different aspects of mathematics. I also chose the topic to explore what kinds of data sets the law can be applied to and where it can be used in the real world. I used the Fibonacci sequence as a test case because I wanted to find another mathematical concept to which Benford’s Law could be applied. I then used data from the Namibia Statistics Agency so that I could test the law on an arbitrary set of real-world data. As for real-world uses, the case of the Arizona Treasury manager is an example from financial forensics that was useful in my investigation. In this investigation I was able to use mathematical concepts such as logarithms and statistics.


Introduction

Statistics is a branch of mathematics that uses numerical data to identify trends and patterns. These trends and patterns can be used to make predictions that help to solve problems. The aspect of statistics discussed in this mathematical investigation is Benford’s law, stated by Frank Benford in 1938. Benford’s law concerns the frequency of the first digit of the numbers in many sets of data. It states that, in such a set of data, numbers whose first digit is 1 occur most often. In this investigation I will first explain the concept of Benford’s Law. Afterwards, I will use data from the Namibia Statistics Agency on Buildings Completed, and the Fibonacci sequence, to test Benford’s law. I will then provide an example of how the law is used in the real world in financial forensics, particularly in the detection of fraud. From what I discover in the investigation I will draw a conclusion regarding Benford’s Law.

Background

Benford’s Law was first stated by Simon Newcomb in 1881, but was popularized by Frank Benford, who restated the law in 1938. Benford stated the law after examining data sets from numerous sources, including the surface areas of rivers, death rates, and telephone numbers. He found that numbers beginning with the digit one occurred around 30% of the time, while numbers beginning with the digit two occurred around 17.6% of the time. As the first digit increased, the frequency of numbers beginning with that digit decreased. Numbers beginning with the digit nine therefore occur much less often than numbers beginning with any other digit, contrary to the expectation that each digit from 1 to 9 has an equal chance of being the first digit of a number in a set of data. Figure 1 represents this distribution.

Figure 1

Simon Newcomb had calculated this distribution with the formula below, where n is the first digit:

$$\log_{10}\left(\frac{n+1}{n}\right)$$
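The formula can be evaluated for each first digit to obtain the expected frequencies. The following is a minimal sketch in Python; the function name `benford_probability` is an illustrative choice and does not come from the original investigation.

```python
import math


def benford_probability(n: int) -> float:
    """Expected share of numbers whose first digit is n, for n from 1 to 9."""
    return math.log10((n + 1) / n)


# Expected frequency for every possible first digit.
expected = {n: benford_probability(n) for n in range(1, 10)}

for n, p in expected.items():
    print(f"{n}: {p:.1%}")
```

The nine values add up to 1, because the sum of the logarithms collapses to log10(10).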

Benford’s law also has uses in the real world, not only as a statistical phenomenon but also as a method of detecting fraud in financial forensics. For example, an Arizona Treasury manager was accused of committing cheque fraud in 1993. Figure 2 is a list of the transactions made by the manager. Under scrutiny, one notices that most of the values begin with an 8 or a 9. This was the manager’s first mistake, as his list of transactions therefore correlates poorly with Benford’s Law. Most of the values in the data lie just below US$ 100 000, which acts as a threshold for the data. This was probably because the perpetrator avoided any transactions above that value, as they would have required a human signature instead of the automated transfer using the Treasury’s computer system.

Figure 2

However, genuine data are not distributed in this manner. People do not expect some first digits to occur more frequently than others, and assume that the digits in a set of numbers occur with random frequencies. Figure 3 displays the frequency of the values in the data set. As this example shows, Benford’s Law can detect a fraudulent distribution of data in financial statements. In the analysis, I will investigate whether it also applies to other statistical data and sets of data.
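As a rough check on why a list dominated by first digits 8 and 9 stands out, the formula from the Background can be applied to those two digits together. This is a sketch only; P(n) is notation introduced here for the expected frequency of first digit n.

$$P(8) + P(9) = \log_{10}\left(\frac{9}{8}\right) + \log_{10}\left(\frac{10}{9}\right) = \log_{10}\left(\frac{10}{8}\right) \approx 0.097$$

Under Benford’s Law, fewer than 10% of values would therefore be expected to begin with an 8 or a 9.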

Analysis

The Fibonacci sequence will be used as an example of how Benford’s Law applies to a set of data. The Fibonacci sequence is a recursive sequence that begins with the number zero, followed by one. The third term is the sum of the two previous terms, 0 and 1, which equals 1. Each subsequent term follows the same pattern, making the sequence:

{0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …}
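The rule described above can be written as a recurrence. The symbol F_k for the term at index k is notation assumed here, not taken from the original text.

$$F_0 = 0, \qquad F_1 = 1, \qquad F_k = F_{k-1} + F_{k-2} \quad \text{for } k \ge 2$$

For example, $F_2 = F_1 + F_0 = 1 + 0 = 1$ and $F_9 = F_8 + F_7 = 21 + 13 = 34$.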

The Fibonacci sequence was used to test Benford’s Law on a naturally occurring sequence. The first 200 numbers in the sequence were calculated using Wolfram Alpha, a computational knowledge engine available on the internet.

I then took the numbers from the sequence and counted how many of them began with each of the digits 1-9 respectively. I calculated the frequencies using my Graphical Display Calculator. Table 1 displays the frequencies of the first digits in the data set.

First Digit   Number of occurrences   Frequency
1             60                      30.0%
2             36                      18.0%
3             25                      12.5%
4             18                      9.0%
5             17                      8.5%
6             12                      6.0%
7             11                      5.5%
8             12                      6.0%
9             9                       4.5%
Total         200                     100%

Table 1
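The counting step can also be reproduced programmatically. The sketch below is one possible way to do it in Python rather than the method used in the investigation (which relied on Wolfram Alpha and a calculator). It assumes the sequence starts at 1, 1 as in the Appendix, and the function names are illustrative.

```python
from collections import Counter


def fibonacci_numbers(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    numbers = []
    a, b = 1, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def first_digit_tally(numbers: list[int]) -> Counter:
    """Count how many numbers begin with each digit."""
    return Counter(int(str(n)[0]) for n in numbers)


fib = fibonacci_numbers(200)
tally = first_digit_tally(fib)

for digit in range(1, 10):
    share = tally[digit] / len(fib)
    print(f"{digit}: {tally[digit]} occurrences ({share:.1%})")
```

Python integers have arbitrary precision, so the later terms are computed exactly rather than rounded.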

Figure 3 plots the frequencies with the graph of $f(n) = \log_{10}\left(\frac{n+1}{n}\right)$ to examine the correlation between the two.

Figure 3: Frequency of the first digits in the Fibonacci sequence. Horizontal axis: (n) first digit of number. Vertical axis: frequency of first digit. Values of f(n) shown on the graph: 30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1%, 4.6%.
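The comparison in Figure 3 can be put into numbers. The sketch below takes the counts from Table 1 and computes a Pearson correlation coefficient and the largest gap between the observed and expected frequencies. Using Pearson's r here is an assumption about what "correlation" means in this context; the investigation itself compared the two graphically.

```python
import math

# First-digit counts for digits 1..9, copied from Table 1.
observed_counts = [60, 36, 25, 18, 17, 12, 11, 12, 9]
total = sum(observed_counts)

observed = [count / total for count in observed_counts]
expected = [math.log10((n + 1) / n) for n in range(1, 10)]


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(observed, expected)
largest_gap = max(abs(o - e) for o, e in zip(observed, expected))
print(f"Pearson r = {r:.4f}, largest gap = {largest_gap:.2%}")
```

With these counts the coefficient is above 0.99 and no digit differs from the expected frequency by as much as one percentage point.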

The data from the first 200 numbers of the Fibonacci sequence followed Benford’s Law very closely, which demonstrates the law’s usefulness in describing the distribution of first digits in natural mathematical sequences. I then continued the investigation by using a set of data that had not been tested against Benford’s Law. The data come from a Namibia Statistics Agency report named Monthly Building Report: January 2015. The report contains indices of Buildings Completed in Windhoek, Swakopmund, Walvis Bay, and Ongwediva in Namibia for the period from January 2010 to December 2014. The report does not disclose what values were used to calculate the indices. It also contains a composite index calculated from the four towns. The values used in this section of the mathematical investigation are those of the composite index.

The composite values are listed in Table 2.

Month   2010    2011    2012    2013    2014
Jan     47.9    63.6    31.6    102.9   69.7
Feb     123.5   68.4    162.7   119.5   145.1
Mar     83.7    85.8    61.2    176.8   96.2
Apr     85      179.5   221.6   124.5   61
May     99.5    76.3    154.6   159.6   135.5
Jun     125.6   160.7   259.1   139.7   104.7
Jul     165.3   125.7   120.1   87.5    316.4
Aug     72.7    171.6   122.1   137.3   153.7
Sep     120.7   237.7   120.1   70.2    134.8
Oct     97.4    53.2    253.9   108.4   277.2
Nov     95.2    102.8   131.3   90.6    108.8
Dec     83.6    100.1   83.9    80.3    107.5

Table 2

The tally was then performed again on the composite indices. Table 3 shows the frequency of the digits 1-9 as first digits of the indices.

First Digit   Number of occurrences   Frequency
1             31                      51.7%
2             5                       8.3%
3             1                       1.7%
4             1                       1.7%
5             2                       3.3%
6             5                       8.3%
7             3                       5.0%
8             7                       11.7%
9             5                       8.3%
Total         60                      100%

Table 3
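Because the index values contain decimal points, reading off the first digit needs slightly more care than for whole numbers. The sketch below shows one way to do it with Python's `decimal` module. The helper name `first_digit` is hypothetical, and the five sample values are the first index values quoted in the report extract (January to March 2010 and January to February 2011), not the full data set.

```python
from collections import Counter
from decimal import Decimal


def first_digit(value: str) -> int:
    """Return the first non-zero digit of a decimal number given as text."""
    digits = Decimal(value).as_tuple().digits
    return next(d for d in digits if d != 0)


# A few index values only, to illustrate the tally.
sample = ["47.9", "123.5", "83.7", "63.6", "68.4"]
sample_tally = Counter(first_digit(v) for v in sample)

for digit in sorted(sample_tally):
    print(digit, sample_tally[digit])
```

Passing the values as text avoids binary floating-point rounding, and values below 1 such as "0.052" are handled as well.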

Figure 4 displays the frequencies with the graph of $f(n) = \log_{10}\left(\frac{n+1}{n}\right)$ to test the relationship between the two.

Figure 4: Frequency of first digits in Buildings Indices. Horizontal axis: (n) first digit of number. Vertical axis: frequency. Series shown: Frequency and Benford's Law.

It was found that the values from the Namibia Statistics Agency (NSA) did not follow Benford’s law as closely as the values from the Fibonacci sequence. This could be because the values were shaped by human intervention: they were not naturally recorded, and each index was compiled from several different values. This may imply that Benford’s law only works with data sets that contain naturally occurring or naturally recorded numbers. In addition, the data used in the investigation were given an index threshold of 300, which creates a statistical bias, since any values over 300 are neglected.
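One way to quantify how closely each data set follows the law is a chi-square goodness-of-fit statistic. This test was not part of the original investigation. The sketch below applies it to the counts in Table 1 and Table 3, with illustrative names. The critical value 15.507 is the standard 5% value for 8 degrees of freedom.

```python
import math

# Expected share of each first digit 1..9 under Benford's Law.
BENFORD = [math.log10((n + 1) / n) for n in range(1, 10)]

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5_PERCENT_DF8 = 15.507


def chi_square_vs_benford(counts: list[int]) -> float:
    """Chi-square statistic of observed first-digit counts against Benford's Law."""
    total = sum(counts)
    return sum(
        (observed - total * p) ** 2 / (total * p)
        for observed, p in zip(counts, BENFORD)
    )


fibonacci_counts = [60, 36, 25, 18, 17, 12, 11, 12, 9]  # Table 1
building_counts = [31, 5, 1, 1, 2, 5, 3, 7, 5]          # Table 3

fibonacci_stat = chi_square_vs_benford(fibonacci_counts)
building_stat = chi_square_vs_benford(building_counts)

for name, stat in [("Fibonacci", fibonacci_stat), ("Building indices", building_stat)]:
    verdict = "consistent with" if stat < CRITICAL_5_PERCENT_DF8 else "deviates from"
    print(f"{name}: chi-square = {stat:.2f}, {verdict} Benford's Law at the 5% level")
```

The statistic is roughly 0.7 for the Fibonacci counts and roughly 30.6 for the building indices. With only 60 index values, several expected counts fall below 5, so the second result should be read as approximate.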


Conclusion

In conclusion, I found that Benford’s Law does not apply to every set of data, particularly data over which humans had a great influence, such as the Namibia Statistics Agency Building Indices. In that case a maximum threshold had been set, which created a bias in the data set. Another possible cause is that the values in the data set were compiled from different values whose sources were not disclosed in the report. However, Benford’s Law did apply to a data set composed of a natural sequence, the Fibonacci sequence. In addition, as the Arizona Treasury manager’s case of fraud in 1993 shows, Benford’s Law can be used to detect fraud in financial data, because human interference can be easily detected.


List of Sources

Benford, Frank. 1938. "The law of anomalous numbers." Proceedings of the American Philosophical Society 551–572.
Namibia Statistics Agency. 2015. Namibia Statistics Agency. February 16. http://www.nsa.org.na/files/downloads/187_Building%20Plans.pdf.
Newcomb, Simon. 1881. "Note on the frequency of use of the different digits in natural numbers." American Journal of Mathematics 39–40.
Nigrini, Mark J. 1999. I've Got Your Number. May 1. http://www.journalofaccountancy.com/issues/1999/may/nigrini.
Weisstein, Eric W. n.d. "Benford's Law" -- from Wolfram MathWorld. http://mathworld.wolfram.com/BenfordsLaw.html.
Wolfram Alpha. 2015. first 200 fibonacci numbers - Wolfram|Alpha. February 20. http://www.wolframalpha.com/input/?i=first+200+fibonacci+numbers.


Appendix

Position (n)   Number in sequence
1     1
2     1
3     2
4     3
5     5
6     8
7     13
8     21
9     34
10    55
11    89
12    144
13    233
14    377
15    610
16    987
17    1597
18    2584
19    4181
20    6765
21    10946
22    17711
23    28657
24    46368
25    75025
26    121393
27    196418
28    317811
29    514229
30    832040
31    1346269
32    2178309
33    3524578
34    5702887
35    9227465
36    14930352
37    24157817
38    39088169
39    63245986
40    102334155
41    165580141
42    267914296
43    433494437
44    701408733
45    1134903170
46    1836311903
47    2971215073
48    4807526976
49    7778742049
50    12586269025
51    20365011074
52    32951280099
53    53316291173
54    86267571272
55    139583862445
56    225851433717
57    365435296162
58    591286729879
59    956722026041
60    1548008755920
61    2504730781961
62    4052739537881
63    6557470319842
64    10610209857723
65    17167680177565
66    27777890035288
67    44945570212853
68    72723460248141
69    117669030460994
70    190392490709135
71    308061521170129
72    498454011879264
73    806515533049393
74    1304969544928660
75    2111485077978050
76    3416454622906710
77    5527939700884760
78    8944394323791460
79    14472334024676200
80    23416728348467700
81    37889062373143900
82    61305790721611600
83    99194853094755500
84    160500643816367000
85    259695496911123000
86    420196140727490000
87    679891637638612000
88    1100087778366100000
89    1779979416004710000
90    2880067194370820000
91    4660046610375530000
92    7540113804746350000
93    12200160415121900000
94    19740274219868200000
95    31940434634990100000
96    51680708854858300000
97    83621143489848400000
98    135301852344707000000
99    218922995834555000000
100   354224848179262000000
101   573147844013817000000
102   927372692193079000000
103   1500520536206900000000
104   2427893228399980000000
105   3928413764606870000000
106   6356306993006850000000
107   10284720757613700000000
108   16641027750620600000000
109   26925748508234300000000
110   43566776258854900000000
111   70492524767089100000000
112   114059301025944000000000
113   184551825793033000000000
114   298611126818977000000000
115   483162952612010000000000
116   781774079430987000000000
117   1264937032043000000000000
118   2046711111473990000000000
119   3311648143516980000000000
120   5358359254990970000000000
121   8670007398507950000000000
122   14028366653498900000000000
123   22698374052006900000000000
124   36726740705505800000000000
125   59425114757512700000000000
126   96151855463018400000000000
127   155576970220531000000000000
128   251728825683550000000000000
129   407305795904081000000000000
130   659034621587630000000000000
131   1066340417491710000000000000
132   1725375039079340000000000000
133   2791715456571050000000000000
134   4517090495650390000000000000
135   7308805952221450000000000000
136   11825896447871800000000000000
137   19134702400093300000000000000
138   30960598847965100000000000000
139   50095301248058400000000000000
140   81055900096023500000000000000
141   131151201344082000000000000000
142   212207101440105000000000000000
143   343358302784187000000000000000
144   555565404224293000000000000000
145   898923707008480000000000000000
146   1454489111232770000000000000000
147   2353412818241250000000000000000
148   3807901929474030000000000000000
149   6161314747715280000000000000000
150   9969216677189300000000000000000
151   16130531424904600000000000000000
152   26099748102093900000000000000000
153   42230279526998500000000000000000
154   68330027629092400000000000000000
155   110560307156091000000000000000000
156   178890334785183000000000000000000
157   289450641941274000000000000000000
158   468340976726457000000000000000000
159   757791618667731000000000000000000
160   1226132595394190000000000000000000
161   1983924214061920000000000000000000
162   3210056809456110000000000000000000
163   5193981023518030000000000000000000
164   8404037832974140000000000000000000
165   13598018856492200000000000000000000
166   22002056689466300000000000000000000
167   35600075545958500000000000000000000
168   57602132235424800000000000000000000
169   93202207781383200000000000000000000
170   150804340016808000000000000000000000
171   244006547798191000000000000000000000
172   394810887814999000000000000000000000
173   638817435613191000000000000000000000
174   1033628323428190000000000000000000000
175   1672445759041380000000000000000000000
176   2706074082469570000000000000000000000
177   4378519841510950000000000000000000000
178   7084593923980520000000000000000000000
179   11463113765491500000000000000000000000
180   18547707689472000000000000000000000000
181   30010821454963500000000000000000000000
182   48558529144435400000000000000000000000
183   78569350599398900000000000000000000000
184   127127879743834000000000000000000000000
185   205697230343233000000000000000000000000
186   332825110087068000000000000000000000000
187   538522340430301000000000000000000000000
188   871347450517368000000000000000000000000
189   1409869790947670000000000000000000000000
190   2281217241465040000000000000000000000000
191   3691087032412710000000000000000000000000
192   5972304273877740000000000000000000000000
193   9663391306290450000000000000000000000000
194   15635695580168200000000000000000000000000
195   25299086886458700000000000000000000000000
196   40934782466626800000000000000000000000000
197   66233869353085500000000000000000000000000
198   107168651819712000000000000000000000000000
199   173402521172798000000000000000000000000000
200   280571172992510000000000000000000000000000
