Math Is Fun Forum
  Discussion about math, puzzles, games and fun.

#1 2013-07-24 07:15:53

zxcvbnm123
Member
Registered: 2013-05-08
Posts: 15

Counting probability

I need some help calculating a probability using Bayes' theorem.

It is given that 1/10 of messages are spam messages. The spam detector will correctly identify a message as spam 89 percent of the time. The spam detector will correctly identify a message as not spam 89 percent of the time too. The detector has just detected a message as being spam. What is the probability that the message is actually not a spam message?

I have come up with some numbers, but I am really not sure they are correct.

Could someone please guide me with this? I am really confused.

Last edited by zxcvbnm123 (2013-07-24 23:57:20)

Offline

#2 2013-07-24 07:27:40

SteveB
Member
Registered: 2013-03-07
Posts: 535

Re: Counting probability

You want:

P (not spam | flag)

P (not spam | flag) = P (not spam and flag) / P (flag)

How do we work out P (not spam and flag) ?

"given that 1/10 of messages are spam messages."

Does this mean that 0.1 of all messages are flagged as spam ?
Or that 0.1 of all messages are spam ?

So P(flag) = 0.1
Or is it P(spam) = 0.1

"The spam detector will correctly identify a message as spam 89 percent of the time.
The spam detector will correctly identify a message as not spam 89 percent of the time too."

Last edited by SteveB (2013-07-24 07:56:05)

Offline

#3 2013-07-24 07:51:08

zxcvbnm123
Member
Registered: 2013-05-08
Posts: 15

Re: Counting probability

Sorry, can you elaborate more? I just assumed that I understood the theory and guessed the numbers. If I am wrong, please tell me in more detail.

Offline

#4 2013-07-24 07:59:55

SteveB
Member
Registered: 2013-03-07
Posts: 535

Re: Counting probability

"The detector has just detected a message as being spam. What is the probability that the message is actually not a spam message?"

So we are looking for: P(not spam given that the message is flagged as spam)
Or using your notation: P(~spam | flag)
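
Written out in full, this target quantity can be sketched with Bayes' theorem and the law of total probability. Here S stands for "the message is spam" and F for "the detector flags the message"; both symbols are shorthand introduced for this sketch and are not notation from the question.

```latex
P(\lnot S \mid F)
  = \frac{P(F \mid \lnot S)\,P(\lnot S)}
         {P(F \mid S)\,P(S) + P(F \mid \lnot S)\,P(\lnot S)}
```

The denominator is just P(F) split into its two cases, flagged spam and flagged non-spam.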

I am a little confused myself and I may need to think some more about this.

I am not sure either way at the moment. I am trying to draw a tree diagram on paper to help.

Last edited by SteveB (2013-07-24 08:22:57)

Offline

#5 2013-07-24 08:09:25

zxcvbnm123
Member
Registered: 2013-05-08
Posts: 15

Re: Counting probability

The question is pretty confusing for me too, but I think what I need to look for is P(not spam given that the message is flagged as spam).

Offline

#6 2013-07-24 08:35:00

SteveB
Member
Registered: 2013-03-07
Posts: 535

Re: Counting probability

I am wondering whether it might help to draw a diagram similar to this adapting where needed
to suit the problem:

Spam event      Flag event      Outcome
- true          - true          spam AND flag
                - false         spam AND not flag
- false         - true          not spam AND flag
                - false         not spam AND not flag

In theory Bayes' theorem rests on the product rule: P (A and B) = P(A | B) x P(B)

The trouble is: do we really know the values to plug into that formula?
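
One possible way to fill in the tree is sketched below in Python. It assumes the reading that 1/10 of all messages are spam (the original poster has not confirmed this) and that the two 89 percent figures are P(flag | spam) and P(not flag | not spam). The function name tree_leaves and the constant names are made up for this illustration.

```python
from fractions import Fraction

# Assumed reading of the question (not confirmed by the original poster):
# 1/10 of all messages are spam, and the detector is right 89% of the time
# on spam and on non-spam alike.
P_SPAM = Fraction(1, 10)
P_FLAG_GIVEN_SPAM = Fraction(89, 100)          # correct on spam
P_NOFLAG_GIVEN_NOT_SPAM = Fraction(89, 100)    # correct on non-spam


def tree_leaves(p_spam, p_flag_given_spam, p_noflag_given_not_spam):
    """Return the joint probability of each of the four tree branches."""
    p_not_spam = 1 - p_spam
    return {
        ("spam", "flag"): p_spam * p_flag_given_spam,
        ("spam", "not flag"): p_spam * (1 - p_flag_given_spam),
        ("not spam", "flag"): p_not_spam * (1 - p_noflag_given_not_spam),
        ("not spam", "not flag"): p_not_spam * p_noflag_given_not_spam,
    }


leaves = tree_leaves(P_SPAM, P_FLAG_GIVEN_SPAM, P_NOFLAG_GIVEN_NOT_SPAM)
for (spam_event, flag_event), probability in leaves.items():
    print(f"{spam_event:>8} AND {flag_event:<8} = {float(probability):.3f}")
```

Each branch is a product-rule term of the form P(A and B) = P(B | A) x P(A), and the four branches together should add up to 1.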

Offline

#7 2013-07-24 08:42:22

SteveB
Member
Registered: 2013-03-07
Posts: 535

Re: Counting probability

(0.10*0.11 + 0.9*0.89)

To my mind this reads as P(flag | not spam) x P(flag) + P(~flag) x P(~flag | not spam)
Which using Bayes simplifies to: P (flag and not spam) + P (~flag and not spam)
Which perhaps also simplifies to: P (not spam)

0.10*0.11 / (0.10*0.11 + 0.9*0.89)

This I thought came from: P(flag and not spam) / P(not spam)

Is that what you meant and why you made that calculation?

If you let A = flag
and let B = not spam
Then apply Bayes of P (A and B) = P (A | B ) x P (B)
then this supports this logic.
But P(A | B) = P (flag | not spam)
unless I am mistaken we want P (not spam | flag)
So we want P (B | A) perhaps ?

P (B | A) x P(A) = P(B and A)

The trouble with that argument is that it does not work. Perhaps your answer is correct.

because P(B and A) = P (A and B)
so P (B | A) = P(A and B) / P(A)

The problem with that is that it simply gives us the answer 0.11, which does not make sense.
It would be daft to ask that question, so are you sure you have formulated everything correctly
and consistently?
Make sure you use the term "flag" consistently and do not confuse it with "spam".
Is 0.1 definitely the probability that a random message is spam?
Or is 0.1 the probability that a random message is flagged?

   "given that 1/10 of messages are spam messages."

   Does this mean that 0.1 of all messages are flagged as spam ?
   Or that 0.1 of all messages are spam ?

   So P(flag) = 0.1
   Or is it P(spam) = 0.1 ? (I think this version is more likely, but you have used the other)

I will have to stop for now. I am sharing your confusion about this. It is a complicated problem
and the original formulation of the probabilities is not very clear.
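
One way to test the two readings is to note that the law of total probability ties P(flag) to P(spam) through the two 89 percent figures. The sketch below assumes those figures mean P(flag | spam) = 0.89 and P(not flag | not spam) = 0.89, and solves for the P(spam) that a given P(flag) would require. The function name implied_p_spam is hypothetical and used only for this illustration.

```python
from fractions import Fraction

P_FLAG_GIVEN_SPAM = Fraction(89, 100)      # detector correct on spam
P_FLAG_GIVEN_NOT_SPAM = Fraction(11, 100)  # detector wrong on non-spam


def implied_p_spam(p_flag):
    """Solve p_flag = 0.89*p + 0.11*(1 - p) for p = P(spam)."""
    return (p_flag - P_FLAG_GIVEN_NOT_SPAM) / (
        P_FLAG_GIVEN_SPAM - P_FLAG_GIVEN_NOT_SPAM
    )


p_spam_if_flag_is_tenth = implied_p_spam(Fraction(1, 10))
print(p_spam_if_flag_is_tenth)  # a negative value means no valid P(spam) exists
```

Under these assumptions the P(flag) = 0.1 reading would need a negative P(spam), because even with no spam at all the detector would flag 11 percent of messages. That points towards P(spam) = 0.1 as the reading that is consistent with the rest of the question.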

Last edited by SteveB (2013-07-24 09:37:32)

Offline

#8 2013-07-24 17:05:37

SteveB
Member
Registered: 2013-03-07
Posts: 535

Re: Counting probability

I think the answer might be: 0.099 / (0.089 + 0.099)

Reason:

Let us assume that P(spam) = 0.1
and that P(not spam) = 0.9

Now I will consider the two possibilities of flag and not flag as if it were an event afterwards:

Spam Event      Flag Event      Probability of Spam Event AND Flag Event
True  (0.1)     True  (0.89)    0.089 *
True  (0.1)     False (0.11)    0.011

False (0.9)     True  (0.11)    0.099 *
False (0.9)     False (0.89)    0.801

The two that I have marked with * are very significant because both concern a situation
where the flag event is true, so in these cases the message has been flagged as spam.
The total probability of this is (0.089 + 0.099) = 0.188.
We are also interested in the probability of the situation where the spam event
was false but the flag event was true. This is 0.099.
Hence P(not spam | flag ) = 0.099 / (0.099 + 0.089) = 0.526596 (to 6 dp.)
If you do that style of analysis with the assumption that P(flag) = 0.1
then it gives a rather silly answer of 0.11. It is silly because it only uses half the tree,
does not use Bayes or any conditional logic, and is just the probability of incorrectly
flagging a message that is not spam, on the basis of (1 - 0.89) = 0.11.
This seems too simple to be true for this type of problem.
So I prefer the answer of about 52.66%.
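
The arithmetic above can be checked with a short Python sketch. It assumes the same reading as the table, P(spam) = 0.1 with the detector correct 89 percent of the time on both spam and non-spam, and uses exact fractions to avoid rounding. The function name p_not_spam_given_flag is made up for this illustration.

```python
from fractions import Fraction


def p_not_spam_given_flag(p_spam, p_flag_given_spam, p_noflag_given_not_spam):
    """Bayes' theorem for P(not spam | flag)."""
    false_alarm = (1 - p_spam) * (1 - p_noflag_given_not_spam)  # not spam AND flag
    true_alarm = p_spam * p_flag_given_spam                      # spam AND flag
    return false_alarm / (false_alarm + true_alarm)


answer = p_not_spam_given_flag(Fraction(1, 10), Fraction(89, 100), Fraction(89, 100))
print(answer, round(float(answer), 6))
```

Another way to read the result: out of 1000 messages, 100 are spam and 89 of those get flagged, while 900 are not spam and 99 of those get flagged, so 99 of the 188 flagged messages are not spam.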

Last edited by SteveB (2013-07-24 17:09:28)

Offline
