# What is Git?

Git is a tool that tracks changes to your files over time. Think of it as an "undo history" for your entire project that you can browse, search, and share with others.

## Commits: Snapshots of Your Project

A **commit** is like taking a photo of your entire project at a specific moment in time. Every time you make a commit, Git:

1. Records what **all** your tracked files look like right now
2. Adds a message describing what changed
3. Notes who made the change and when
4. Links back to the previous commit

### Every Commit Contains Everything

This is important: each commit is a complete snapshot of **all** files in your project - not just the files you changed. You can check out any commit and see the entire project exactly as it was.

But wait - doesn't that waste a lot of space? No! Git is clever about this.

### Unchanged Files Are Reused

If a file hasn't changed since the last commit, Git doesn't store a new copy. Instead, it simply points to the version it already has:

```
Commit 1             Commit 2             Commit 3
─────────            ─────────            ─────────
README.md ──────────────────────────────> (same)
app.py ────────────> app.py (v2) ───────> (same)
config.json ───────> (same) ────────────> config.json (v3)
```

In this example:

- `README.md` never changed - all three commits refer to the same stored version
- `app.py` changed in Commit 2, so a new version was stored
- `config.json` changed in Commit 3, so a new version was stored

This means:

- Every commit gives you the **complete picture** of your project
- Git only stores **new content** when files actually change
- Going back to any point in history is fast - Git reads that snapshot directly, with no need to "replay" changes

```
 Commit 3       Commit 2         Commit 1
     |              |                |
     v              v                v
[Add login] --> [Fix bug] --> [First version]
```

Each commit points back to its parent (the arrows above run from newer to older), creating a chain of history. You can always go back and see exactly what your project looked like at any point.

## The Magic of Checksums

Here's where Git gets clever.
Every commit gets a unique ID called a **checksum** (or "hash"). It is written in hexadecimal, so it only uses the digits 0-9 and the letters a-f. It looks like this:

```
a1b2c3d4e5f6a7b8c9d0...
```

This ID is calculated from the **contents** of the commit - the files, the message, the author and date, and the parent commit's ID.

Why does this matter?

### Verification

If even one character changes in a file, the checksum becomes completely different. This means:

- Git can detect when something has been corrupted or tampered with
- You can trust that what you downloaded is exactly what was uploaded

### Finding Differences

When you connect to another copy of the repository, Git compares checksums:

```
Your computer:        Server:
Commit A              Commit A    (same checksum = identical)
Commit B              Commit B    (same checksum = identical)
Commit C              (missing)   (you have something new!)
```

Git doesn't need to compare every file. It just compares the short checksums to quickly work out what's different.

## Distributed: Everyone Has a Full Copy

Unlike older systems where one central server held all the history, Git is **distributed**. This means:

- Every person has a complete copy of the entire project history
- You can work offline - commit, browse history, create branches
- If the server disappears, anyone's copy can restore everything
- You sync with others by exchanging commits

```
[Alice's Computer]          [Bob's Computer]
        |                           |
  Full history                Full history
  All branches                All branches
        |                           |
        +---- [Shared Server] ------+
                     |
               Full history
               All branches
```

When Alice pushes her new commits to the server, Bob can pull them down. The checksums let Git detect anything that was lost or corrupted in transit.

## Putting It All Together

1. **You work** - Edit files, create new ones, delete old ones
2. **You commit** - Take a snapshot with a descriptive message
3. **Git calculates** - Creates a unique checksum for this commit
4. **You push** - Send your commits to a shared server
5. **Others pull** - Download your commits using checksums to verify
6. **History grows** - The chain of commits gets longer

That's it!
Git is essentially a distributed database of snapshots, connected together and verified by checksums. Everything else - branches, merges, rebasing - builds on these simple ideas.

## Key Takeaways

- **Commit** = A snapshot of your project at one moment
- **Checksum** = A unique fingerprint calculated from the content
- **Distributed** = Everyone has a full copy, not just the server
- **History** = A chain of commits, each pointing to its parent

You don't need to understand every detail to use Git effectively. Just remember: commit often, write clear messages, and sync with your team regularly.
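
If you are curious how snapshots, reused files, and checksums fit together, here is a minimal Python sketch. It is a toy model, not Git's actual implementation: the `ToyRepo` class and its methods are hypothetical names made up for illustration, and the commit ID here is a simplified stand-in for Git's real commit format (it leaves out the date so the output is repeatable). The one part meant to match real Git is `blob_id`, which hashes file content with SHA-1 over a short `blob <size>` header, the same scheme `git hash-object` uses in a default SHA-1 repository.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Checksum of one file's content: SHA-1 over a 'blob <size>\\0' header plus the bytes."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ToyRepo:
    """Hypothetical, simplified model of a repository (not Git's real data format)."""

    def __init__(self):
        self.blobs = {}    # checksum -> file content, each version stored once
        self.commits = {}  # commit ID -> commit record
        self.head = None   # ID of the most recent commit

    def commit(self, files: dict, message: str, author: str) -> str:
        # A commit lists every file, but only as a checksum pointing at stored content.
        snapshot = {}
        for path, content in sorted(files.items()):
            checksum = blob_id(content)
            self.blobs.setdefault(checksum, content)  # unchanged files are not stored again
            snapshot[path] = checksum

        # Simplified commit ID: covers the snapshot, message, author, and parent ID.
        record = repr((sorted(snapshot.items()), message, author, self.head))
        commit_id = hashlib.sha1(record.encode()).hexdigest()
        self.commits[commit_id] = {
            "files": snapshot,
            "message": message,
            "author": author,
            "parent": self.head,
        }
        self.head = commit_id
        return commit_id

    def log(self) -> list:
        # Walk the chain from the newest commit back through each parent.
        messages = []
        commit_id = self.head
        while commit_id is not None:
            messages.append(self.commits[commit_id]["message"])
            commit_id = self.commits[commit_id]["parent"]
        return messages


repo = ToyRepo()
c1 = repo.commit({"README.md": b"# Demo\n", "app.py": b"print('v1')\n"}, "First version", "alice")
c2 = repo.commit({"README.md": b"# Demo\n", "app.py": b"print('v2')\n"}, "Fix bug", "alice")

print(len(repo.blobs))  # 3: one README.md, two versions of app.py
print(repo.log())       # ['Fix bug', 'First version']
```

Both commits list `README.md`, yet its content is stored only once because the checksum is the same. Changing a single byte of `app.py` produces a different checksum, which in turn changes the commit ID.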