The difficulties we faced were on two levels: finding an appropriate theoretical model and tailoring it to SuperLink.
The Bayesian network is traversed many times. In each traversal, node values are randomized based on the probability tables.
Expect nodes are nodes in a Bayesian network whose data does not add up to 1: the sum over the values of such a node, given its predecessors' values, is less than 1. It can be proved that the product of the values generated at these nodes, averaged over many traversals, converges to the exact inference value of the network.
The implementation is confined to a single module that performs the approximation (MonteCarlo.[ch]). A few general definitions need to be set in settings.h. The code itself contains a significant amount of documentation.
Implementing the algorithm in SuperLink required us to disable some of SuperLink's features. SuperLink tries to reduce the number of nodes in the network and thereby reduce computation; this reduction, however, increases the number of expect nodes and thereby increases the number of computations needed by the approximation algorithm. Another difficulty was isolating the probability table of a specific node. A theoretical bound on the number of traversals, computed according to [1], can be used, but is not recommended since it requires too much time; instead, the program can be supplied with the number of traversals the approximation algorithm should make.
In all the examples we used, exact inference was faster by far. The main problem that the SuperLink environment poses for this algorithm is the use of logical values (i.e. zero and one probabilities) or probabilities that are close to logical. A simple example is described in diagram 1; the lower node is the expect node. Notice that the sum over its values, given the predecessor's value, is not 1. In 99 out of 100 traversals we will randomize the value 1 for the top node, and the expect node's contribution will then be zero. Only in the remaining traversal will we randomize the value 0 for the top node, and the expect node will then contribute 0.5. Dividing 0.5 by 100 (the sum of the expect-node probabilities over all traversals, divided by the number of traversals made) gives the end result, 0.005. But what if we make only 50 traversals, or if in the 100 traversals we make we never hit 0 in the top node?
The example supplies an intuitive understanding of the problem: where logical and close-to-logical values are used, a very large number of traversals is needed in order to converge to the inferred value. In addition, a theoretical bound on the number of traversals validates this view, as shown in [1].
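To make this concrete (the exact constants are given in [1]; the form below is the standard relative-error sample bound for Monte Carlo estimation of a small mean), the number of traversals N needed to estimate a value mu within relative error epsilon, with confidence 1 - delta, scales as

```latex
N = O\!\left(\frac{\ln(2/\delta)}{\varepsilon^{2}\,\mu}\right)
```

Since N grows as 1/mu, the near-zero inference values produced by logical probabilities drive the required number of traversals up sharply: in the diagram-1 example, mu = 0.005 alone contributes a factor of 200 relative to mu = 1.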
We used the MakePed utility to create the input files. These are the parameters we used:

Number of loci desired in the data file: 2
Place of the disease locus: 1
Program is not sex-linked
Number of null loci: 0
Number of alleles per locus: 3
Type of locus no. 0: alleles type locus
Type of locus no. 1: alleles type locus
Recombination value: 0.1 / 0.3 / 0.5 (3 possible values for every assigned pedigree size)
No multiple marriages
No loops
Size of pedigree: 3 to 18
Following are sample runs on randomly generated input files. The files were varied on two criteria: pedigree size and recombination factor. A larger pedigree size was expected to make the Bayesian network larger and more difficult to handle for both tested approaches; the recombination factor was considered as a means to affect the logical-values problem. Values are log-likelihoods; calculation times are in seconds. The number next to MonteCarlo is the number of traversals over the network.
Pedigree  Recomb.  Deterministic           MonteCarlo 2000         MonteCarlo 20000        MonteCarlo 200000
Size      Factor   Value       Time        Value       Time        Value       Time        Value       Time
3         0.1      2.747035    0.000000    2.748870    0.160000    2.743392    1.270000    12.657098   8.290000
4         0.1      2.580212    0.000000    2.683621    0.170000    2.620569    1.420000    2.599492    9.950000
5         0.1      6.111203    0.000000    6.199487    0.330000    6.085544    2.040000    6.120306    16.040000
5         0.3      5.706751    0.000000    5.667138    0.330000    5.697237    3.080000    5.711430    17.080000
5         0.5      5.365020    0.050000    5.296761    0.160000    5.426840    1.260000    5.376147    8.570000
6         0.1      6.698798    0.000000    6.600389    0.330000    6.667788    3.130000    6.671289    20.430000
6         0.3      7.671187    0.000000    7.582318    0.330000    7.645987    3.730000    7.665147    18.730000
6         0.5      8.625642    0.000000    8.584443    0.220000    8.640189    1.430000    8.623507    8.570000
7         0.1      8.974668    0.000000    9.571315    0.440000    9.015012    3.790000    8.969255    23.950000
7         0.3      8.046899    0.000000    8.035737    0.390000    8.039623    4.500000    8.047561    22.470000
7         0.5      7.172967    0.000000    7.060475    0.220000    7.106232    2.250000    7.183487    12.020000
8         0.1      7.172967    0.000000    7.060475    0.220000    7.106232    2.250000    7.183487    12.020000
8         0.3      9.690607    0.000000    9.950005    0.490000    9.620946    4.500000    9.696056    24.440000
8         0.5      9.186951    0.000000    8.910845    0.280000    9.308785    2.420000    9.162657    15.160000
9         0.1      8.755576    0.060000    0.000000    0.710000    0.000000    5.820000    8.420663    58.550000
9         0.3      11.232086   0.000000    0.000000    0.600000    11.305155   6.200000    11.191212   32.730000
9         0.5      11.040140   0.000000    0.000000    0.330000    0.000000    3.460000    11.167107   18.120000
10        0.1      7.919520    0.060000    0.000000    0.710000    7.921716    5.930000    7.946076    37.680000
10        0.3      8.837329    0.000000    0.000000    5.380000    0.000000    32.240000   9.049059    305.670000
10        0.5      9.956336    0.000000    0.000000    0.330000    9.815304    3.250000    10.162091   18.510000
11        0.1      12.230570   0.000000    0.000000    0.660000    12.781721   6.210000    12.389024   38.720000
11        0.3      10.367531   0.000000    10.517442   0.610000    10.341351   5.280000    10.382309   31.740000
11        0.5      12.204816   0.000000    0.000000    0.380000    12.151225   3.620000    12.090527   19.170000
12        0.1      13.708077   0.050000    0.000000    0.770000    13.529714   6.270000    13.383586   39.600000
12        0.3      12.149852   0.000000    0.000000    0.940000    0.000000    7.080000    12.262694   46.090000
12        0.5      14.045091   0.000000    0.000000    0.330000    14.335848   3.840000    13.974120   21.140000
14        0.1      13.246331   0.160000    0.000000    1.100000    13.029461   7.470000    13.401072   54.380000
14        0.3      16.064232   0.060000    0.000000    1.210000    0.000000    9.170000    16.271196   101.060000
14        0.5      19.663534   0.000000    0.000000    0.490000    0.000000    4.670000    0.000000    25.490000
16        0.1      14.958027   0.000000    0.000000    1.210000    0.000000    8.570000    0.000000    90.400000
16        0.3      15.982722   0.050000    0.000000    1.210000    0.000000    8.680000    0.000000    60.030000
16        0.5      15.482457   0.000000    15.483322   0.550000    15.484372   4.610000    15.482537   24.880000
18        0.1      23.364095   0.060000    0.000000    1.100000    0.000000    8.290000    0.000000    63.440000
18        0.3      17.241915   0.000000    0.000000    1.160000    0.000000    8.680000    0.000000    63.270000
18        0.5      17.795665   0.000000    0.000000    0.550000    17.911613   5.550000    17.911613   30.260000

We ran three of the above examples for a longer period of time:
Pedigree  Recomb.  MonteCarlo 50000        MonteCarlo 2000000
Size      Factor   Value       Time        Value       Time
14        0.5                              0.000000    368.390000
16        0.3      0.000000    144.460000
18        0.3      0.000000    163.630000

This approximation algorithm for inference in Bayesian networks seems to have failed. We can point to several reasons for this: the prevalence of logical and near-logical probability values in the networks SuperLink builds, which demands an enormous number of traversals for convergence, and the incompatibility with SuperLink's node-reduction features, which had to be disabled.
[1] Paul Dagum and Michael Luby. An optimal approximation algorithm for Bayesian inference. Artificial Intelligence, 93:1-27, 1997.
[2] Robert Fung and Kuo-Chu Chang. Weighing and integrating evidence for stochastic simulation in Bayesian networks. In Uncertainty in Artificial Intelligence 5, pages 209-219, New York, NY, 1989. Elsevier Science Publishing Company, Inc.