It's one higher.
Fine.
XL means it's based on SDXL, so it only runs with SDXL model(s) and with software (and VRAM) that can handle it.
So the model needs to match the LoRA, and whatever software you use has to support it.
SDXL is generally trained on 1024x1024 images compared to the base Stable Diffusion at 512x512. So that's the optimal size to start with.
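If you're running it from Python instead of a UI, a minimal sketch with the Hugging Face diffusers library looks roughly like this. The LoRA path is a placeholder; the point is just that the LoRA has to be an SDXL one to load against an SDXL base, and that you start at 1024x1024.

```python
# Minimal sketch using diffusers; the LoRA file path is a placeholder.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # an SDXL base checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# The LoRA must have been trained on SDXL; an SD 1.5 LoRA won't work here.
pipe.load_lora_weights("path/to/some_sdxl_lora.safetensors")

# 1024x1024 is SDXL's native training resolution, so that's the place to start.
image = pipe("a photo of a cat", width=1024, height=1024).images[0]
image.save("cat.png")
```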
You can run it with an 8GB video card, but it's a tight fit; in my case I have to close most of my web browser windows because they use too much GPU memory, fucking Chrome.
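For what it's worth, if you're on diffusers there are a couple of knobs that make 8GB more workable. A sketch, assuming the `pipe` from above (CPU offload needs the `accelerate` package installed):

```python
# Memory-saving options for an 8GB card (sketch; assumes `pipe` from the example above).
pipe.enable_model_cpu_offload()  # call this instead of .to("cuda"); keeps only the active sub-model on the GPU
pipe.enable_vae_tiling()         # decodes the image in tiles so the VAE step doesn't spike VRAM
```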
A few more notes: the base SDXL comes in two parts, the main model and a refiner model. It's optimized for the two of them together, and most software supports the two-stage setup. Almost every other model you'll find doesn't use the refiner, since they've been trained without it.
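In diffusers the two-stage setup looks roughly like this: the base does most of the denoising and hands a latent off to the refiner. The model names are the official SDXL 1.0 repos; the 80/20 split is just a common default, not gospel.

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# The refiner can share the base's second text encoder and VAE to save memory.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a photo of a cat"
# Base handles the first ~80% of the denoising and outputs a latent image...
latents = base(prompt, denoising_end=0.8, output_type="latent").images
# ...then the refiner finishes the last ~20% to clean up fine detail.
image = refiner(prompt, image=latents, denoising_start=0.8).images[0]
image.save("cat_refined.png")
```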
You will sometimes see models that say "Baked VAE" or "Needs VAE". In that case you may need to tell your software to use the VAE inside the model (Baked) or a separate one. Using the wrong option causes all kinds of visual artifacts. There is a standard SDXL VAE that's used for most models where it's not baked. In simple terms, the VAE converts the output of the model, a latent image, into an actual human-viewable image.
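If your software is diffusers-based, swapping in a separate VAE is one extra argument. A sketch: the finetune name below is made up, and the VAE repo is the widely used fp16-safe copy of the standard SDXL VAE.

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# "Needs VAE" case: load the standard SDXL VAE separately and hand it to the pipeline.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLPipeline.from_pretrained(
    "someone/some-sdxl-finetune",  # placeholder for a model that ships without a baked VAE
    vae=vae,
    torch_dtype=torch.float16,
).to("cuda")

# "Baked VAE" case: drop the vae= argument and the checkpoint's own VAE gets used instead.
```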