KF Math Thread - Discuss Math

 
ZFC is an easy to understand, intuitive and practical axiomatic system that does pretty much anything a mathematician might need, I never understood the sperging about other axiomatic systems.
Of course for like 90% of what mathematicians are doing, ZFC is more than adequate, mostly because 90% of mathematics doesn't even particularly rely on the foundations at all as long as they're reasonable. There are examples where subtleties involving the axiom of choice, the continuum hypothesis, etc. can have a non-trivial effect, but they're not usually relevant (you bringing up non-measurable sets earlier is an interesting example, since if you drop choice then you can produce models of the real numbers where there are none). There are situations (e.g. in category theory) where one needs to assume things about sufficiently large sets existing, but I've never figured out quite what the precise issue is (something about functor categories?).
An argument for alternative foundations like type theory is that if you care about computer proof systems, then type theory is a very natural way to approach that, since it's both a possible foundation for mathematics as well as one for computer science. There are also some other ideas in this direction with trying to have formal systems that allow one to reason about things like infinity-categories as though they're not autistic as fuck, but as far as I can tell none of them are particularly successful at that.
 
There are examples where subtleties involving the axiom of choice, the continuum hypothesis, etc. can have a non-trivial effect, but they're not usually relevant (you bringing up non-measurable sets earlier is an interesting example, since if you drop choice then you can produce models of the real numbers where there are none).
AFAIK said models assume the existence of inaccessible cardinals, which is IMO a very bold axiom. The Axiom of Choice has some nasty consequences like the infamous Banach–Tarski paradox; however, these nasty results don't disappear with vanilla ZF, they just become undecidable, so non-measurable sets are still a pain in the ass either way.
There are situations (e.g. in category theory) where one needs to assume things about sufficiently large sets existing, but I've never figured out quite what the precise issue is (something about functor categories?).
Ah yes, this is necessary to define many categories, like the category of sets, without suffering from paradoxes.
An argument for alternative foundations like type theory is that if you care about computer proof systems, then type theory is a very natural way to approach that since it's both a possible foundation for mathematics as well as one for computer science. There's also some other ideas in this direction with trying to have formal systems that allow one to reason about things like infinity-categories as though they're not autistic as fuck, but as far as I can tell none of them are particularly successful at that.
That seems like a reasonable argument though I know too little to comment.
 
I have been going down a rabbit hole of "esoteric" series representations.
Everyone knows Taylor/MacLaurin, and likely Fourier, but have you seen Padé Approximants?
[Attached image: 1718472368539.png]
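If anyone wants to play with these, scipy has a helper that turns Taylor coefficients into a Padé approximant. A minimal sketch, comparing against the plain truncated Taylor polynomial (the exp(x) example and the chosen orders are just my own illustration):

```python
# Build a [3/2] Pade approximant of exp(x) from its Taylor coefficients
# and compare it against the truncated Taylor polynomial. Illustration only.
import numpy as np
from math import factorial
from scipy.interpolate import pade

# Taylor coefficients of exp(x) about 0: a_n = 1/n!
taylor = [1.0 / factorial(n) for n in range(6)]

# Denominator of degree 2, so the numerator gets degree 6 - 2 - 1 = 3
p, q = pade(taylor, 2)   # p, q are numpy.poly1d objects

x = 1.5
print("exp(x)         =", np.exp(x))
print("Taylor (deg 5) =", sum(c * x**n for n, c in enumerate(taylor)))
print("Pade [3/2]     =", p(x) / q(x))
```

The rational form usually squeezes noticeably more accuracy out of the same handful of coefficients, which is most of the appeal.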
 
What's a good place to start when it comes to learning the math that is involved with machine learning?
Machine learnig "math" is called Numerical methods, or at least it's the building blocks for "machine learning". These were my main sources when I learned the basics of it:
Assuming you know some basic linear algebra and calculus. Keep in mind, these are an approach from the perspective of physicist using these methods, so they might get a little more in depth of constructing the model itself rather than the "data training" part that a programmers maybe would focus on etc. Computer nerds would often just try to bruteforce problems with more data or iterations to make their models work better, in these resources I provided they try to improve the model itself and limit the amount of data/iterations you are allowed to use. If it doesn't fit you, ignore this post.

A fun little project you can play around with is facial recognition. As you can see in the video, it's not that complicated once you understand the basics of numerical methods; it's pretty intuitive. I know some /g/ faggots are going to seethe at the fact it's Matlab, but it's the math/algorithm that is the focus here.
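For anyone who wants a non-Matlab starting point, the classic "eigenfaces" approach is basically just PCA via the SVD. A rough numpy sketch; the faces array here is a random stand-in for your own flattened grayscale images, and I'm not claiming this is exactly what the video does:

```python
# Eigenfaces sketch: PCA on flattened grayscale face images via the SVD.
# `faces` is a placeholder; substitute your own dataset (N images, each 64x64 pixels).
import numpy as np

faces = np.random.rand(100, 64 * 64)   # stand-in for 100 flattened face images

mean_face = faces.mean(axis=0)
centered = faces - mean_face

# Rows of Vt are the principal directions in pixel space ("eigenfaces")
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt[:20]                   # keep the top 20 components

def project(img):
    """Coordinates of an image in the eigenface basis."""
    return eigenfaces @ (img - mean_face)

def nearest_face(img, gallery):
    """Toy recognition step: nearest neighbour in eigenface space."""
    coords = project(img)
    dists = [np.linalg.norm(coords - project(g)) for g in gallery]
    return int(np.argmin(dists))

# usage: idx = nearest_face(some_new_image, faces)
```

The whole pipeline is just linear algebra, which is why it makes a nice first numerical-methods project.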
 
There are situations (e.g. in category theory) where one needs to assume things about sufficiently large sets existing, but I've never figured out quite what the precise issue is (something about functor categories?).
For the working mathematician, just saying "okay it's actually a class, my bad" and adding "locally small" to your categories works well enough. After all, most of the "useful" parts of CT usually involve the locally small Set (i.e. Yoneda).
But seeing how CT is also about finding and abstracting over patterns, you soon hit stuff like 2/n/infinity-categories, or just large categories in general, where you have to be really careful about how you talk about objects.

What you're referring to I think is an issue that comes up when talking about (locally) small categories with the usual universe construction. Let U be a universe, and let locally small categories be those with hom-sets that are U-small. If you consider functor categories for locally small categories, how large are the functor categories? We know the hom-sets are in U, but the categories themselves might not be in U (they're only locally small). You have to either define a larger universe, or assume that one exists in order to talk about functor categories in this case.
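To spell out where the size problem actually bites, here's my own back-of-the-envelope restatement (so take it with a grain of salt):

```latex
% Take C a U-small category and D locally U-small (e.g. D = Set_U).
% For functors F, G : C -> D, a natural transformation is a family of components, so
\mathrm{Nat}(F,G) \;\subseteq\; \prod_{c \in \mathrm{Ob}(C)} \mathrm{Hom}_{D}(Fc,\, Gc),
% a product of U-small sets over a U-small index set, hence U-small:
% Fun(C, D) is locally U-small. But an object of Fun(C, D) assigns to each c some
% Fc ranging over Ob(D), which need not lie in U, so the collection of functors
% (the objects of Fun(C, D)) can fail to be U-small. Hence the need for a larger
% universe U' with U in U' if you want to treat Fun(C, D) as an honest category again.
```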

This happens basically whenever you need to look at >2-cats and I think the universe construction is so popular because it's the least amount of red tape needed to get back to good ol' CT autism compared to other set theories, though it might be interesting seeing what lies at the bottom of category construction when you consider other exotic set theories.
 
This looks like a good book; its sections for differential equations look promising. Here's a PDF (tor mirror) for the 2nd edition.
Combine that book with "Mathematical Methods for Scientists and Engineers" by Donald A. McQuarrie. You are now very well equipped to solve most shit that will show up. The book is pretty much a "recipe cookbook" for solving differential equations: you try some of the methods given and hopefully one will work out in your case.

If the content in these books (the ones I have recommended in this thread) doesn't cover the problems you are trying to solve, I would really be questioning the model/problem setup you were given in the first place, since at that point you are at borderline research-level math/physics.
 
Combine that book with "Mathematical Methods for Scientists and Engineers" by Donald A. McQuarrie. You are now very well equipped to solve most shit that will show up. The book is pretty much a "recipe cookbook" for solving differential equations: you try some of the methods given and hopefully one will work out in your case.

If the content in these books (the ones I have recommended in this thread) doesn't cover the problems you are trying to solve, I would really be questioning the model/problem setup you were given in the first place, since at that point you are at borderline research-level math/physics.
McQuarrie is pretty solid, saved my ass in quantum. Those Bessel equations are a bitch.
 
What you're referring to I think is an issue that comes up when talking about (locally) small categories with the usual universe construction. Let U be a universe, and let locally small categories be those with hom-sets that are U-small. If you consider functor categories for locally small categories, how large are the functor categories? We know the hom-sets are in U, but the categories themselves might not be in U (they're only locally small). You have to either define a larger universe, or assume that one exists in order to talk about functor categories in this case.
Your reply prompted me to finally bother to go think properly about how to do things about size in a sensible way, and I think I have it figured out now. In part because your phrasing made me see the correct way to look at things. I suddenly see how to actually make use of this hierarchy of universes U₁, U₂, ..., that's usually mentioned.
On a related note, I do think it's kind of remarkable that set theory enters into category theory at all, since (at least to me) it always felt like category theory should be this purely syntactic thing (which I guess it is if you work in a type theoretic setting, probably). The thing that comes to mind is the small object argument, but maybe it's obvious why set theory comes into play if one actually understands the statement and proof (I don't, yet). There are also things like localizations of categories, where you can take a small category and localize it at a (necessarily small) set of morphisms and end up with a category which is not locally small (because one has to zigzag too many arrows). On the other hand, this is not an interesting example to bring up because all the localizations anyone ever cares about can be proven to be locally small (either because one is localizing at a "multiplicative system" as in e.g. Kashiwara & Schapira's book Categories and Sheaves, or because one is localizing a model category). Other, less precise examples that I can think of just have to do with large categories themselves being necessary to do interesting mathematics, it seems. I've seen people doing things like K-theory claim such things, at least (e.g. in one of Dustin Clausen's recent talks on Efimov K-theory at the IHES). So somehow controlling the sizes of sets cannot be avoided.

But seeing how CT is also about finding and abstracting over patterns, you soon hit stuff like 2/n/infinity-categories, or just large categories in general, where you have to be really careful on how you talk about objects.
This I want to ask about, though, because I don't see the relation between n-categories (1 < n ≤ ∞) and size issues. Of course, one must be more careful about how one defines smallness (e.g. because, of course, in a moral sense objects will only be defined up to some notion of equivalence, and that notion could be really disrespectful of set theory; a point is homotopy equivalent to a disk, and obviously they are quite different in size), but I don't see why the situation is fundamentally different from ordinary 1-category theory. Notably, one can of course do an entire theory of n-categories purely inside a fixed universe, so everything is small (just as one can do with 1-categories). The standard approach to ∞-categories following Lurie, Joyal, etc., is basically this, since one models them with simplicial sets (and these naturally assume a fixed universe with which to form the category Set).
 
but I don't see why the situation is fundamentally different from ordinary 1-category theory
The standard approach to ∞-categories following Lurie, Joyal, etc., is basically this, since one models them with simplicial sets
Technically, it's the same problem as in a 1-category, but I think morally it spirals out and you can see this in the limiting case where you need a bit more subtlety than just gluing additional supersets.

The issue of where these objects live is something you have to deal with either way. The only difference, I guess, is whether you posit it existentially or constructively; either you climb up to an ∞-category (and fight the universe hydra), or you go top down and define everything and its place. You're still dealing with explaining how you're accessing whatever you're talking about and you have to make sure it makes sense and holds up.

Is it fundamentally a difference? Probably not, but the list of things you have to keep track of grows with how enriched your category is, and that's exponentially more diagram chasing, even if under the hood.

It's a bit annoying that set theory should come up so often in category theory - wasn't the whole idea to subsume it? Especially in fundamental matters like being able to talk about categories, it gets all rather messy pretty quickly. That's one of the reasons why I prefer types; they feel like a much better fit for what CT is about, though I can't dispute the wealth of results that reassessing sets has brought to categories.

Also, I've only now realized I've been reading it as simplical all this time (because vertex -> vertical, right?)
 
I've become addicted to taking an operator T, then creating an associated operator U = exp(T), and applying it to power series or taylor series.
As long as T has a closed form for being raised to a given power, you can typically end up having one for U.
A great example is U=exp(k*D) where D is the derivative: you end up with U f(x) = f(x+k). I actually saw this operator used without explanation or justification in a quantum class (in the form of exp(p), where p is the momentum operator), and it bothered me so much that I had to do the derivation myself.

Another fun thing is to get a representation of your operator as an infinite matrix that multiplies the polynomial basis vectors, then take the transpose and see what that gives you
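If anyone wants to see the exp(k*D) example in action, here's a quick sympy sanity check, truncating the operator exponential at a finite order (my own illustration, nothing rigorous):

```python
# Check that a truncated exp(k*D) acts as a shift on a nice analytic function.
import sympy as sp

x, k = sp.symbols('x k')
f = sp.sin(x)
N = 25  # truncation order of the operator exponential

# (exp(k*D) f)(x) ~ sum_{n < N} f^{(n)}(x) * k^n / n!
Uf = sum(sp.diff(f, x, n) * k**n / sp.factorial(n) for n in range(N))

# Compare numerically against f(x + k) at a sample point
vals = {x: sp.Rational(3, 10), k: sp.Rational(1, 2)}
print(sp.N(Uf.subs(vals)))             # truncated series
print(sp.N(sp.sin(x + k).subs(vals)))  # exact shifted value, should agree to many digits
```

(The agreement depends on analyticity; it breaks for smooth-but-not-analytic functions like exp(-1/x^2) at 0, as discussed further down the thread, since the series only sees the derivatives at x.)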
 
I've become addicted to taking an operator T, then creating an associated operator U = exp(T), and applying it to power series or taylor series.
As long as T has a closed form for being raised to a given power, you can typically end up having one for U.
A great example is U=exp(k*D) where D is the derivative: you end up with U f(x) = f(x+k). I actually saw this operator used without explanation or justification in a quantum class (in the form of exp(p), where p is the momentum operator), and it bothered me so much that I had to do the derivation myself.

Another fun thing is to get a representation of your operator as an infinite matrix that multiplies the polynomial basis vectors, then take the transpose and see what that gives you
Oh, quick note, make sure you understand your operator's properties, otherwise you'll just be wrong.
For instance, the logarithmic derivative operator isn't linear, so if you apply it to a function you can't just pull it inside the Taylor series term by term.
 
I have a couple of questions about e and the natural logarithm.

1) I have seen, and accept, several proofs and justifications that d(a^x)/dx = ln(a)*a^x. For example, here's a screenshot from this 3Blue1Brown video where Prof. Sanderson plugs 2^t into the definition of the derivative and factors 2^t out:
[Attached image: exponentproof.jpg]
I can see that the part in parentheses is the "inverse exponential" definition of ln(2) with the n parameters replaced with their reciprocals, and I accept this. My question is more philosophical: when going from dt=1 to dt=0, dy/dt goes from 2^t to ln(2) * 2^t. Where does this logarithm come from? To be clear, I am not asking why the base of this logarithm is e, of all numbers, a point covered in the video. I am asking why shrinking dt to 0 causes the derivative to be scaled down proportionally by this constant. Where does this constant come from? And why is it logarithmic with the base (of the exponent) as input?

2) Somewhat related to question 1. Is there a reason this scaling constant is also the area under 1/x? In other words, is this relationship a coincidence?
[Attached image: exp-nat-int.png]
 
I have a couple of questions about e and the natural logarithm.

1) I have seen, and accept, several proofs and justifications that d(a^x)/dx = ln(a)*a^x. For example, here's a screenshot from this 3Blue1Brown video where Prof. Sanderson plugs 2^t into the definition of the derivative and factors 2^t out:
[Attached image: exponentproof.jpg]
I can see that the part in parentheses is the "inverse exponential" definition of ln(2) with the n parameters replaced with their reciprocals, and I accept this. My question is more philosophical: when going from dt=1 to dt=0, dy/dt goes from 2^t to ln(2) * 2^t. Where does this logarithm come from? To be clear, I am not asking why the base of this logarithm is e, of all numbers, a point covered in the video. I am asking why shrinking dt to 0 causes the derivative to be scaled down proportionally by this constant. Where does this constant come from? And why is it logarithmic with the base (of the exponent) as input?

2) Somewhat related to question 1. Is there a reason this scaling constant is also the area under 1/x? In other words, is this relationship a coincidence?
[Attached image: exp-nat-int.png]
So, logarithms are simply the conversion of one exponential base into another. The natural log is the conversion to and from the natural base, e. By divine providence, e^x has the property that it's a fixed point of (an eigenfunction with eigenvalue 1 of) the derivative and integral operators, and it is in effect the natural basis for any differential equation in one form or another. If you were comfortable using 2^x, you could do all of your calculus and just have some abstract factor k that appears when doing derivatives and integrals, but by inspection, it will always be the natural log of 2.

I realize this isn't terribly satisfying, but as far as my understanding goes, that's why
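For what it's worth, here's the short computation that ties both questions together (standard calculus, just restated):

```latex
% Question 1: rewrite a^x over the natural base and differentiate.
a^x = e^{x \ln a}
\quad\Longrightarrow\quad
\frac{d}{dx}\, a^x = (\ln a)\, e^{x \ln a} = (\ln a)\, a^x .
% The constant that appears as dt -> 0 is exactly the slope of a^t at t = 0:
\lim_{h \to 0} \frac{a^h - 1}{h} = \ln a .
% Question 2: not a coincidence. Since exp and ln are inverses, the inverse-function rule gives
\frac{d}{dx} \ln x = \frac{1}{e^{\ln x}} = \frac{1}{x}
\quad\Longrightarrow\quad
\int_1^a \frac{dx}{x} = \ln a - \ln 1 = \ln a .
% So the same constant shows up both as the scaling factor in the derivative of a^x
% and as the area under 1/x from 1 to a.
```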
 
I've become addicted to taking an operator T, then creating an associated operator U = exp(T), and applying it to power series or taylor series.
As long as T has a closed form for being raised to a given power, you can typically end up having one for U.
A great example is U=exp(k*D) where D is the derivative: you end up with U f(x) = f(x+k). I actually saw this operator used without explanation or justification in a quantum class (in the form of exp(p), where p is the momentum operator), and it bothered me so much that I had to do the derivation myself.

Another fun thing is to get a representation of your operator as an infinite matrix that multiplies the polynomial basis vectors, then take the transpose and see what that gives you
I'm currently on break, but I considered what you said about the operator U. One can devise the function h on the reals such that h(x) = exp(-1/x^2) for x > 0 and h(x) = 0 for x <= 0. This is a smooth function with all the derivatives of h at x = 0 being zero. For this, one would have U(h)(0) = 0 for any chosen k. However, h(k) > 0 for k > 0. Thus U(h)(x) != h(x + k) when x = 0.

This got me thinking about what sort of functions f would satisfy U(f)(x) = f(x + k). Clearly it worked for polynomials, but certainly not so for all smooth functions. So what about analytic functions? It does, and the proof of this is below.
[Attached proof images: proof1.png through proof5.png]
The basic principle behind the proof is that for an analytic function I can show that for small enough shifts of k one has U(f)(x) = f(x + k). By proving some additional facts, I can show that a certain size of k will always be enough, and thus I can use these small shifts to finitely build up to any shift I desire.

So, one has U(f)(x) = f(x + k) for analytic f, which is very good since those are generally the functions we want. For non-analytic smooth functions, such as in the counterexample I provided, I'm sure there are some that do in fact still work, but I'll leave that up as an exercise to those that are interested.

Some additional notes:
  • The symbol of two horseshoes nested into one another is for "compact subset".
  • Lemma 2 depends on an alternate characterization of analytic functions. I was hoping to show that the radius of convergence for a power series continuously varies with the center, but at the end of the day all I needed was a minimum radius of convergence for power series representations centered over a compact interval.
  • Although the analytic function f is taken on all the reals, the proof doesn't really change so long as the domain of f is real, open and connected.
 
This got me thinking about what sort of functions f would satisfy U(f)(x) = f(x + k). Clearly it worked for polynomials, but certainly not so for all smooth functions. So what about analytic functions? It doesn't, and the proof of this is below.
A quick correction to the above proof: About a week after I made this post and had moved on from the problem, I mentioned the result and the proof to a colleague of mine. When it came to the series representation I had for U_k, he pointed out potential convergence issues with the chosen k. Now, this was something that did occur to me while working on the problem, but I erroneously thought I had it handled with the fact that U_{k+l} = U_k o U_l as in "Statement and Initial Results". However, looking at U_k more carefully, I realized the error, and its subtlety, I believe, is worth mentioning.

Take the expression for U_k given in "Statement and Initial Results". See that if one rewrites k as ((x + k) - x), the series becomes precisely the Taylor series of f centered at x, evaluated at x + k. This illuminates a few things:
  1. It is obvious then that for entire functions, whose power series representations hold on the entire real line, one has U_k(f)(x) = f(x + k) for any x and k, as seen with polynomials and the exponential function for example.
  2. The failure of U_k(f)(x) to evaluate correctly with k > 0 at x = 0 for f(x) = exp(-1/x^2) makes sense, since U_k(f)(0) is literally the Taylor series of f at x = 0, which is just 0.
  3. Furthermore, for functions such as f(x) = log(x + 1), taken on their domain (-1, infty) with center x = 0, one clearly has k for which U_k(f)(x) = f(x + k) fails, for example k = 3. Now, while f in this case is not analytic on all of R, the same technique shows the failure of U_k(f)(x) = f(x + k) for analytic-but-not-entire functions, since your k can exceed a local radius of convergence.
This is all seen with a simple substitution. Now, with this perspective, the potential error in U_{k+l} = U_k o U_l becomes clear: Take U_k o U_l(f)(x). Suppose the power series of f at x is defined at x + l and the power series of f at x + l is defined at x + k + l. Then one has U_l(f)(x) = f(x + l) and thus U_k o U_l(f)(x) = U_k(f)(x + l) = f(x + k + l). However, one might have k, l such that the power series of f at x is not defined at x + k + l. In such a case it cannot be that U_{k+l}(f)(x) = U_k o U_l(f)(x). The subtlety of the error is that though U_{k+l} and U_k o U_l are algebraically the same power series, analytically they are different - because U_{k+l} fixes its center at x while U_k o U_l has a dynamic center that moves in the stages of composition from x to x + l. Leveraging this is why the rhs can produce the desired shift by convergent increments while the lhs fails to do it all at once.
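For reference, spelling out that substitution explicitly (assuming U_k is the exp(k*D)-type series from the attached "Statement and Initial Results" images):

```latex
U_k(f)(x) \;=\; \sum_{n \ge 0} \frac{f^{(n)}(x)}{n!}\, k^n
\;=\; \sum_{n \ge 0} \frac{f^{(n)}(x)}{n!}\, \bigl((x+k) - x\bigr)^n ,
% i.e. literally the Taylor series of f centered at x, evaluated at the point x + k.
% It therefore agrees with f(x + k) exactly when x + k lies inside the interval of
% convergence of that series (and the series converges to f there).
```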

Now, this error doesn't change anything in the proof, it just changes the result. I have not shown that for analytic f and any x and k one has U_k(f)(x) = f(x + k). By the above, that cannot be true. However, what I have shown is that under a family of such operators {U_k_i} any shift is possible. So pick any x and k; one has U_k_1, ..., U_k_n for which U_k_n o ... o U_k_1(f)(x) = f(x + k). And once you understand that U_k(f)(x) is the power series of f centered at x evaluated at x + k, this makes perfect sense given the "Second Lemma". You start at a point and move forward within the interval of convergence. You stop, take a new power series representation with its own interval of convergence, and repeat until you get where you want to go. What the "Second Lemma" confirms is that you may always move forward by a certain amount, and thus, inevitably, get there.

Looking at U_k(f)(x) as a Taylor series does present a clean set of sufficient conditions for U_k(f)(x) = f(x + k). For any smooth function f, if (1) f has a power series representation centered at x and (2) x + k is in its interval of convergence, then one has the desired equality. I'm not saying "necessary" since I'm not sure of that, but I can see that for a smooth f, if one requires U_k(f)(x) = f(x + k) for all k in some neighborhood of 0, then f must be analytic.
 
Based and mathpilled kiwis.

It'd be cool if there was an easy way to add LaTeX support to the forums, like with MathJax or something.
I would definitely like something like that. What I do now if I really need the notation is just type it up in a TeX editor and screencap the pdf.
 