KF Math Thread - Discuss Math

I've given the problem some more thought (enough thought to lose edit permission on my previous post), and I've managed to reformulate it much more nicely.

Firstly, the defragmentation transform τ really only shuffles around the data blocks (otherwise, we'd end up losing existing data). If we write the data on the disc/tape as a function:
d: [1, n]∩ℕ -> B
Then τ is a bijection (a permutation) on Dom(d), which means it's an element of the symmetric group S_n.
This lets us write the cost of the defragmentation more simply as either the total distance we move/swap the blocks (e.g. Σ_{k=1}^{n} |τ(k) - k|) or the number of blocks that change position (e.g. |{k ∈ [1, n]∩ℕ : τ(k) ≠ k}|).
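For concreteness, here's a quick Python sketch of both cost measures, representing a permutation τ ∈ S_n as a list where position k is sent to tau[k] (0-indexed, unlike the 1-indexed notation above; the function names are my own):

```python
def displacement_cost(tau):
    """Total distance the blocks move: sum over k of |tau(k) - k|."""
    return sum(abs(t - k) for k, t in enumerate(tau))

def moved_count(tau):
    """Number of blocks that end up in a new position: |{k : tau(k) != k}|."""
    return sum(1 for k, t in enumerate(tau) if t != k)
```

The identity permutation (a no-op defrag) scores zero under both measures, as you'd want.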

Lastly, if we have a set of data items written to disc/tape D = { D_i }, with each data item being a family of indices corresponding to the locations of its data blocks in storage (i.e. D_i = { i_1, i_2, ..., i_m } ⊆ Dom(d), with the D_i pairwise disjoint), then we can write the fragmentation of a single data item D_i as:

For the indices i_1 < i_2 < ... < i_m of D_i in ascending order:
Σ_{k=1}^{m-1} ( i_{k+1} - i_k - 1 )

This counts the gaps between consecutive blocks making up data item D_i; it's zero exactly when the item is stored contiguously.
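A minimal sketch of that fragmentation measure, assuming a data item is given as a set of block indices (function name mine):

```python
def fragmentation(item):
    """Fragmentation of one data item: sum of the gaps between
    consecutive (sorted) block indices, i.e. sum_{k=1}^{m-1} (i_{k+1} - i_k - 1).
    Zero exactly when the item's blocks sit contiguously on disc.
    """
    idx = sorted(item)
    return sum(idx[k + 1] - idx[k] - 1 for k in range(len(idx) - 1))
```

E.g. an item occupying blocks {3, 4, 5} has fragmentation 0, while {1, 4, 6} has fragmentation 2 + 1 = 3.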

There are a couple of cool things about this approach. For one, S_n is pretty well researched, so there's probably some cool combinatorial/algebraic method I'm unaware of. Second, we've completely avoided evaluating d(k), replacing it with just the size of the disc/tape. And lastly, if we make our defrag cost look at data item indices, we only have to work with one type of object (which may or may not be easier).

This sounds like combinatorial optimisation, something like minimum linear arrangement, but I'm no expert in optimisation, much less so in discrete optimisation.
 
This is the one thread I've seen on here after all these years where I find myself wishing the Farms had embedded LaTeX support.
 
Not exactly embedded, but I threw together a quick browser extension that will render math in posts on the forum. Firefox users can get it from AMO here. Chomelets can pester me in private messages to get a .zip with the unpacked extension. No git forge as of yet, but I can send source tarballs to anyone keen.

You can use this to see if it's working: $\mathbb Nigger$. $$\mathbb N^iG_Ge \mathbf R$$ It should also work in previews so you don't accidentally embarrass yourself with your shitty LaTeX skills. If you click the addon icon it'll rerun on the current page. Also $$ doesn't work across multiple lines right now.
 
Not exactly embedded, but I threw together a quick browser extension that will render math in posts on the forum. Firefox users can get it from AMO here. Chomelets can pester me in private messages to get a .zip with the unpacked extension. No git forge as of yet, but I can send source tarballs to anyone keen.

You can use this to see if it's working: $\mathbb Nigger$. $$\mathbb N^iG_Ge \mathbf R$$ It should also work in previews so you don't accidentally embarrass yourself with your shitty LaTeX skills. If you click the addon icon it'll rerun on the current page. Also $$ doesn't work across multiple lines right now.
$K = FC^2$
 
Chomelets can pester me in private messages to get a .zip with the unpacked extension.
Attaching the .xpi (renamed to .zip) here for posterity. Good work.

Edit: You can right click the "Add to Firefox" button on the AMO page and copy the .xpi link if it ever gets updated. .xpi is just a .zip under the hood, so rename it if needed to load it as a debug plugin.
 

I humbly come before you to ask for intermediate statistics book recommendations; I've discovered my knowledge of it is lackluster and very unrefined.
Ideally I'm looking for something that's more math-oriented; I'm not scared of integrals.

So far I've been rec'd:
- Wasserman
- Mukhopadhyay
- Casella and Berger
 
I see what you did there :tomgirl:
And Anti Snigger said they're not afraid of calculus...

More seriously, I have quite a few statistics / data science books (mostly leaning towards the latter, though there is overlap). I curated and kept those that are in more of an axiom-proof-theorem or similarly mathematically rigorous format to recommend to others, as I am more into the "shut up and code" approach. I could probably do some digging, though.
 
I humbly come before you to ask for intermediate statistics book recommendations; I've discovered my knowledge of it is lackluster and very unrefined.
Ideally I'm looking for something that's more math-oriented; I'm not scared of integrals.

So far I've been rec'd:
- Wasserman
- Mukhopadhyay
- Casella and Berger
See, I already have an idea:
Top-rated comment recommends "Elements of Statistical Learning. The bible, the one and only".
Follow-up comment says: "To build on this, a much more gentle version of this is a free book called Introduction to Statistical Learning. Has a version for both Python and R."

I didn't know there was a Python version but I considered both ESL and ISL (R version) to have made the cut for my archives. There are some lukewarm reviews of the Python version of ISL that refer to the book's use of a specialized library, ISLP, for its own didactic purposes, which builds on existing Python standard and third-party libraries. Well, the R book does the same thing. I don't think this is a serious impediment, and I've seen the like in the past. The book and companion code are there to teach you concepts, which is the real hard part, and from there learning to use the libraries without an added layer of abstraction should be relatively easy. You can of course just look at the code here (and it is not incredibly complex, not even the deep learning stuff):
 
Slightly tangential: what do you think of data science?
In what sense? I don't have really strong opinions about it if that's what you mean

And Anti Snigger said they're not afraid of calculus...
I'm not! :mad:

I didn't know there was a Python version but I considered both ESL and ISL (R version) to have made the cut for my archives. There are some lukewarm reviews of the Python version of ISL that refer to the book's use of a specialized library, ISLP, for its own didactic purposes, which builds on existing Python standard and third-party libraries. Well, the R book does the same thing. I don't think this is a serious impediment, and I've seen the like in the past. The book and companion code are there to teach you concepts, which is the real hard part, and from there learning to use the libraries without an added layer of abstraction should be relatively easy. You can of course just look at the code here (and it is not incredibly complex, not even the deep learning stuff):
I don't really have a strong preference for either language here; provided that I can understand what techniques and algorithms are at play, it's not too hard to translate from one to the other.
I appreciate the rec, I've added it to the org file
 
In what sense? I don't have really strong opinions about it if that's what you mean
It's the most hands-on way to do both traditional statistics and machine learning. There's a quote with various attributions (it doesn't really matter who said it): "Tell me and I'll forget; show me and I may remember; involve me and I'll understand."
I appreciate the rec, I've added it to the org file
Oh you use Org-mode? Any pointers?
 
Oh you use Org-mode? Any pointers?
Honestly, just start learning it as a markdown clone, then learn stuff like code blocks, making custom keyword highlighting for a file, etc.
Don't go into it trying to be a power user, you'll have a bad time.
I've switched from markdown to org and I much prefer it
 
I have a bit of an open-ended question that I don't myself have an answer for.
Do you think mathematics is overly proof focused or not? If so, why?
I definitely think proofs must exist, but I've been finding, more and more, with the things I work on or read about, that the proofs themselves don't really aid me in problem solving.

Really curious to hear takes on this.
 