 |
Sudoku Players' Forums
|
| View previous topic :: View next topic |
| Author |
Message |
PatmaxDaddy
Joined: 18 Oct 2005 Posts: 63 Location: Massachusetts, USA
|
Posted: Sat Feb 11, 2006 2:51 pm Post subject: |
|
|
I've got several results related to RxC band counting:
1) A vastly improved method of generating equivalence classes for subband state vectors, one that results in far fewer classes (probably optimally few), and which can be used for any K-box subband of an RxC band, not just a 2-box subband of 5xC. I've got a fully general implementation of this method that has been tested on 5xC and which will be used for 6xC and maybe higher values of R.
2) A general startegy for RxC band counting based on precomputing counts of successively larger subbands, using the above equivalence classes. The counts are stored in tables indexed by a representative member of each equivalence class of subband state vectors. The tables are sorted for O(log(C)) lookup. Note that this strategy is one possible way to implement Kjell's method, it is not a different method for band counting.
3) A general analysis of the algorithmic complexity of RxC band counting using my strategy for implementing Kjell's method. This analysis shows that the most efficient strategy is to split bands in half (as near as possible), as Kjell originally speculated:
| kjellfp wrote: | | I'm pretty sure the fastest result is achieved by letting M be as close to L/2 as possible | ; but contrary to his later suggestion: | kjellfp wrote: | | If you are precomputing the 2-box subband numbers as for 5xC-counting, and then looking up the values at the final count, I think it will be faster if your first band-breaking will be in one subband of size 2 and one of size 4, instead of two 3-size subbands. Then you break the 4-subband into 2-subbands. |
4) The 5x8 band count, which I can now do in only 10 hours using the new equivalence classes.
Details
In general an RxK subband (K RxC boxes) is described by a state vector with sel(R,K) integer components (sel is the binonial R!/(R-K)!K!), where each component corresponds to a selection of K rows from the R available, with the value equal to the number of symbols that are found in all K rows of the subband. All subbands with the same state vector have the same count. An equivalence class of state vectors can be defined as all vectors that can be generated by any permutation of the R rows. This is a far more comprehensive definition of equivalence classes than I used above in dividing the 10-vector for 5x2 subbands into a pair of 5-element rings, and it applies to any RxK subband. The difficulty in applying this definition is generating a canonical member of a class from any given vector very quickly, since this has to be done in the innermost loop. I finally implemented a reasonably efficient and completely general way to do this. One trick is to use C++ template classes. R and K are template parameters, so the compiler can generate code for any R and K, but R and K are compile-time constants so the compiler can do better optimization.
Here is a comparison of the old and new method:
| Code: |
previous method current method
C classes time classes time
1 3 <1 2 <1
2 19 <1 7 <1
3 69 <1 16 <1
4 208 18 37 5
5 505 276 73 62
6 1089 3552 140 680
7 2106 32202 245
8 415 36375
9 664
10 1031 |
Here is the 5x8 result:
| Code: | (5*8)! * (8!)^20 * 39119289737902332276642894251428026550280700096
= 4115726412595961837618972705021617780954992462923399713992174350142637439412909696
6959611961180887311841288992195822126180369412041063915275822234464904282112000000
00000000000000000000000
= 4.11573e+186 | [Edit: Value corrected 20 Feb 2006]
The RxK state vector is subject to R independent constraints, so there are sel(R,K)-R degrees of freedom (DOF). Note that if K = 1, there are no degrees of freedom, as expected (there's only one way to do it: all of the sel(R,1) components of the state vector must be [Edit] C). To split an RxK subband into RxL and Rx(K-L) subbands (an RxKxL split), one must have sel(K,L) = sel(K,K-L) parameters for each of the sel(R,K) components of the RxK state vector, so the splitting is described by a [sel(R,K) * sel(K,L)]-component split vector. In my above post of 5xC, the 30-vector b describes a split where R = 5, K = 3, L = 2, which the reader can confirm has 30 components. A split vector is subject to sel(R,K) + R - 1 independent constraints, and so has sel(R,K) * (sel(K,L) - 1) - R + 1 DOFs.
Note that an RxK state vector is equivalent to an RxRxK split vector as one might expect. The number of components and the number of DOFs is the same. Thus we can consider state vectors to be just special cases of split vectors.
Now consider 6xC counting. First we compute the counts for each equivalence class of the 6x2 subbands and store them in a sorted table. This requires splitting the 6x2 subband into two 1-box subbands, so R = 6, K = 2, L = 1. The 6x2 state vector has 9 DOFs and the 6x2x1 split vector has 10 DOFs, so the table can be made in O(C^19). Next we compute the counts for each equivalence class of the 6x3 subbands and store them in another table. 6x3 state vectors have 14 DOFs, and 6x3x2 split vectors have 35 DOFs. Since the 6x2 counts can be looked up in O(log(c)), the 6x3 table can be made in O(C^49 * log(C)). Finally, each 6x3 subband in boxes 1 - 3 corresponds to a unique (but different) subband in boxes 4 - 6. Using the 6x3 lookup table, the final count can be done in O(C^14 * log(C)). Thus the overall complexity is O(C^49 * log(C)).
The method just described for 6xC can easily be generalized for any RxC, and the complexity can easily be determined.
As an alternative for 6xC, we could do a 4-2 split instead of a 3-3 split. We'd only need the 6x2 table, but the 6x4x2 split is very expensive, with 70 DOFs. Overall we'd have a complexity of O(C^79 * log(C)).
I'm planning to implement 6xC, possibly using both 3-3 and 4-2 splits so I can check one against the other. It will take some time, however, because my spare time is quite limited.
Last edited by PatmaxDaddy on Mon Feb 20, 2006 3:12 pm; edited 1 time in total |
|
| Back to top |
|
 |
deam3r
Joined: 12 Feb 2006 Posts: 10 Location: Where ever you can imagine me i guess!
|
Posted: Sun Feb 12, 2006 8:04 pm Post subject: |
|
|
also!
Code:
| Code: | 1 3
2 19
3 69
4 208
5 505
6 1089
7 2106 |
Yall would need it! |
|
| Back to top |
|
 |
deam3r
Joined: 12 Feb 2006 Posts: 10 Location: Where ever you can imagine me i guess!
|
Posted: Sun Feb 12, 2006 8:04 pm Post subject: |
|
|
sorry, You allready posted the code..
| Code: | Code:
1 3
2 19
3 69
4 208
5 505
6 1089
7 2106
|
|
|
| Back to top |
|
 |
PatmaxDaddy
Joined: 18 Oct 2005 Posts: 63 Location: Massachusetts, USA
|
Posted: Sat Feb 18, 2006 6:25 pm Post subject: |
|
|
Thought someone might find the following interesting. Define N(R,C) such that the RxC band 1 count is (R*C)! * (C!)^[R*(R-1)] * N(R,C). This is the conventional way we show the count. Here is a plot of log10(N) as a function of C for R=3 (blue), R=4 (yellow), and R=5 (magenta). The lines are best fit, but ignoring some of the low values of C.
The lines look straight on the plot, but they not quite for low values of C; the slope of the curve increases very slightly at each point. They do seem to converge to straight lines for large C, but they converge very slowly. I ran 3xC out to C=15000, and 4xC out to C=100. I couldn't quite tell if 4xC was going to converge, but 3xC seems to do so. The 3xC slope ends up around 0.9031, 4xC around 3.1, and 5xC is around 6.5 or so for the low values of C that I've been able to run. |
|
| Back to top |
|
 |
Red Ed
Joined: 06 Jun 2005 Posts: 927
|
Posted: Sun Feb 19, 2006 10:48 am Post subject: |
|
|
| PatmaxDaddy wrote: | | Thought someone might find the following interesting. ... The 3xC slope ends up around 0.9031 | Perhaps you knew this anyway, but N(3,C) is the Cth Franel number. Asymptotically, this is O(8^C), confirming that the slope on your graph should be log10(8) ~= 0.9031. |
|
| Back to top |
|
 |
PatmaxDaddy
Joined: 18 Oct 2005 Posts: 63 Location: Massachusetts, USA
|
Posted: Mon Feb 20, 2006 3:43 pm Post subject: |
|
|
| I've reentered the corrected values for 5x7 and 5x8, which were lost when the database was restored from backup a few days ago. The post describing the changes is lost, but essentially the errors were caused by overflows that I have since corrected. The challange is knowing when to switch from using native data types that are very fast but can overflow silently, to my arbitrary-precision fixed point class, which cannot overflow but which is much slower. I'm being more careful now. |
|
| Back to top |
|
 |
PatmaxDaddy
Joined: 18 Oct 2005 Posts: 63 Location: Massachusetts, USA
|
Posted: Mon Feb 20, 2006 3:50 pm Post subject: |
|
|
| Red Ed wrote: | | Perhaps you knew this anyway, ... | No, I've never heard of Franel numbers, thanks for pointing that out. I learned enough math to get an MS in electrical engineering, which is a lot but not nearly as much as the regulars who post here know.
I was hoping that these graphs might offer some clue about how the RxC values behave in general, a clue that might allow us to estimate higher values than those we can compute, and maybe use them to estimate grid counts above 6x6. Anyone have any ideas? |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Wed Feb 22, 2006 2:53 pm Post subject: |
|
|
| PatmaxDaddy wrote: | | Here is a plot of log10(N) as a function of C for R=3 (blue), R=4 (yellow), and R=5 (magenta). |
Thanks for a very interresting diagram. I got the impression that N(R,C+1)/N(R,C) had a limit for C -> infinity when I first produced the results for 4xC, but your diagram really demonstrates this visually.
As a mathematician I'd like to determine whether these limits can be calculated explicitely. |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Wed Feb 22, 2006 10:21 pm Post subject: |
|
|
| PatmaxDaddy wrote: | | The 3xC slope ends up around 0.9031 |
Red Ed has already pointed out that the limit is log10(8) = 0.903089987.
I was giving it a thought when I went to bed yesterday.
N(3,C) = sum_k bin(C,k)^3. The most dominating terms are for k near C/2.
We have bin(C,k)/bin(C-1,k-1) = C/k, for k near C/2 this is close to 2. For three-powers, this is close to 8.
So when we sum some of the terms for N(3,C) where k is close to C/2, and divide by the corresponding terms for N(3,C-1), we get something close to 8. The terms further away from C/2 are much smaller.
This is how I can imagine that N(3,C)/N(3,C-1) approaches 8. I don't think it's too difficult to give a strict mathematical proof which more formally handles the side effects of the lower terms.
We might even read the corresponding ratios for higher values of R from the formulaes or the algorithm. |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Thu Feb 23, 2006 1:56 am Post subject: |
|
|
| PatmaxDaddy wrote: | | The 3xC slope ends up around 0.9031, 4xC around 3.1, and 5xC is around 6.5 or so for the low values of C that I've been able to run. |
A corresponding argument to the one used for 3xC has been applied to general R by looking at the RxC algorithm. It looks like N(R,C)/N(R,C-1) will approach (R-1)!^R when C goes to infinity. Some log10-values of this:
R=3: 0.903090
R=4: 3.112605
R=5: 6.901056
R=6: 12.475087
etc.
I'll come back to an explaination on my approximations. |
|
| Back to top |
|
 |
Red Ed
Joined: 06 Jun 2005 Posts: 927
|
Posted: Thu Feb 23, 2006 10:32 am Post subject: |
|
|
| kjellfp wrote: | | This is how I can imagine that N(3,C)/N(3,C-1) approaches 8. I don't think it's too difficult to give a strict mathematical proof which more formally handles the side effects of the lower terms. | I went with the recurrence relation n^2 F(n) = (7n^2 - 7n + 2) F(n-1) + 8(n - 1)^2 F(n-2).
As n->infinity, this is F(n) = 7F(n-1) + 8F(n-2), so F(n) = multiple of (-1)^n or 8^n. |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Thu Feb 23, 2006 1:40 pm Post subject: |
|
|
This implies that there is a reccursion formula for N(3,C) with respect to C.
This of course raises the question: Is there a reccursion formula for N(R,C) for any R? How deep is it? Could this be used to determine PatmaxDaddy's 5xC-sequence further in just a fraction of a second? |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Fri Feb 23, 2007 12:45 pm Post subject: 4xC reccursion formula |
|
|
| One year ago, I wrote: | | This of course raises the question: Is there a reccursion formula for N(R,C) for any R? How deep is it? Could this be used to determine PatmaxDaddy's 5xC-sequence further in just a fraction of a second? |
1. Yes, most certainly
2. VERY deep
3. Definitely not, because (a) I don't think we will find the formula, and (b) still if we did, we would have to precalculate much more 5xC-values than we have now before the reccursion formula could be used.
All this because I'm sure I have found the 4xC reccursion formula. I have not proved it explicitely (and I don't think I will ever dear), but there are good reasons to believe it's correct.
Let R(C) be the number of RxC-bands divided by (4C)! * C!^12. From the experience with 3xC (the Franell numbers) it's reasonable to guess that R(C) fits into som reccursion forumla
| Code: | | F_0(C) * R(C) + F_1(C) * R(C-1) + ... + F_N(C) * R(C-N) = 0 |
where the F_i are polynomials in C.
With some precalculated values of R(C), it's possible to experiment with systems of linear equations to get the polynomial coefficients.
This was best to do modulo a prime first, to get an idea of the reccursion depth and polynomial degrees. Afterwards, I did the calculation with integers and some more R(C) values than those just needed to get the coefficients. The linear system had one non-trivial solution for the guessed degree and depth. Some of the polynomials also had linear factors which fitted into the pattern I could sense in the Franell numbers formula.
For 3xC, depth was 2 and polynomial degree was 2. The coefficients were not bigger than 16.
For 4xC, depth seems to bee 13, polynomial degree is 8, and the biggest coefficient is 2906501472489100167249389420544.
No need to go for the 5xC formula...
The 4xC formula found is given in the next post. |
|
| Back to top |
|
 |
kjellfp
Joined: 04 Oct 2005 Posts: 140 Location: Norway
|
Posted: Fri Feb 23, 2007 12:46 pm Post subject: Re: 4xC reccursion formula |
|
|
The reccursion formula:
If the number of 4xC bands is (4C)! * C!^12 * R(C), then
| Code: | 71 * F_0(C) * R(C)
+ 8 * F_1(C) * R(C-1)
+ 64 * F_2(C) * R(C-2)
+ 512 * F_3(C) * R(C-3)
+ 4096 * F_4(C) * R(C-4)
+ 32768 * F_5(C) * R(C-5)
+ 262144 * F_6(C) * R(C-6)
+ 4194304 * F_7(C) * R(C-7)
+ 201326592 * F_8(C) * R(C-8)
+ 6442450944 * F_9(C) * R(C-9)
+ 618475290624 * F_10(C) * R(C-10)
+ 178120883699712 * F_11(C) * R(C-11)
+ 1846757322198614016 * F_12(C) * R(C-12)
+ 9573589958277615058944 * F_13(C) * R(C-13)
= 0 |
where
| Code: | F_0(C) = - C^6 * (C-1)^2
F_1(C) = (C-1)^2 * [ 171326 - 810515*C + 1560076*C^2 - 1564443*C^3 + 870263*C^4 - 263804*C^5 + 37311*C^6 ]
F_2(C) = - 830839294 + 3290371761*C - 5802615765*C^2 + 5955176757*C^3 - 3888997199*C^4 + 1652951755*C^5
- 445790012*C^6 + 69619894*C^7 - 4813367*C^8
F_3(C) = 312760427352 - 912788970642*C + 1165253562453*C^2 - 850797588265*C^3 + 389100442107*C^4
- 114310113540*C^5 + 21104767851*C^6 - 2243702981*C^7 + 105425743*C^8
F_4(C) = 3720103021142 - 3696846458479*C - 785535474276*C^2 + 2860434508346*C^3 - 1842357443022*C^4
+ 602603994380*C^5 - 111628655770*C^6 + 11210591680*C^7 - 477587165*C^8
F_5(C) = - 797132481675646 + 1299489231533963*C - 916808862420949*C^2 + 364682084171454*C^3
- 89088752639394*C^4 + 13595829346800*C^5 - 1251273614246*C^6 + 62117635822*C^7 - 1212479636*C^8
F_6(C) = 3696733170991086 - 7177797506790705*C + 5553140250884337*C^2 - 2323537859876068*C^3
+ 585627694081732*C^4 - 91933956451664*C^5 + 8826580250024*C^6 - 475273401252*C^7 + 11002070436*C^8
F_7(C) = 79322922412116840 - 83425281275823387*C + 37598964042238470*C^2 - 9402299518397928*C^3
+ 1405635810567280*C^4 - 124924683004160*C^5 + 6004008163344*C^6 - 108710373088*C^7 - 844212016*C^8
F_8(C) = - 161437697999164515 + 165705982988634480*C - 74306699350927977*C^2 + 19004527819592400*C^3
- 3030377651015080*C^4 + 308277136686240*C^5 - 19521269783672*C^6 + 702736586472*C^7 - 10994652516*C^8
F_9(C) = 183648673281809322 - 177108542506870569*C + 74792146023444209*C^2 - 18063629156976634*C^3
+ 2728888216770598*C^4 - 264044128636920*C^5 + 15979121117030*C^6 - 552929120870*C^7 + 8375525860*C^8
F_10(C) = - 31970888811536868 + 28990697313774654*C - 11506943511519192*C^2 + 2611310834912427*C^3
- 370582947612417*C^4 + 33678338511700*C^5 - 1914087918974*C^6 + 62202279563*C^7 - 884918269*C^8
F_11(C) = 739046950653744 - 638403821233752*C + 240673279532528*C^2 - 51736892746984*C^3 + 6938268781591*C^4
- 594536866366*C^5 + 31795428556*C^6 - 970416058*C^7 + 12942869*C^8
F_12(C) = (C-11)^4 * [ 2363016 - 452*C - 130144*C^2 + 16689*C^3 - 603*C^4 ]
F_13(C) = - (C-11)^4 * (C-12)^4 |
|
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|