5.5 Lesson: Data Storage and Persistence

Computers

9th Grade

Practice Problem

Easy

Created by

Klea h

11 Slides • 10 Questions

1

5.5 Data Storage and Persistence

By Klea h

2

Spider bots crawl around the web and create metadata: data about the data.


Very large data companies run data centers that hold up to 50 petabytes of data. That is around 25 trillion pages of text, or roughly 179 times the content of the Library of Congress.
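
The figures above can be checked with a little arithmetic. This is only a rough sketch: it assumes decimal units (1 petabyte = 10**15 bytes), and the variable names are made up for illustration. The 50 petabytes, 25 trillion pages, and 179 times come from the slide; the other values are simply what those numbers imply.

```python
# Assumption: decimal units, so 1 petabyte = 10**15 bytes.
PETABYTE = 10**15
TERABYTE = 10**12

total_bytes = 50 * PETABYTE      # storage figure from the slide
pages = 25 * 10**12              # 25 trillion pages of text

# How many bytes each page would take up on average.
bytes_per_page = total_bytes // pages
print(bytes_per_page)            # 2000

# Size implied for the Library of Congress if 50 PB is 179 times its content.
implied_loc_terabytes = total_bytes / 179 / TERABYTE
print(round(implied_loc_terabytes))  # 279
```

So the slide's numbers work out to about 2,000 bytes (roughly 2,000 characters of plain text) per page.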

But can we keep collecting and storing BIG data indefinitely, in the hope that we may use it someday?

Structuring can help us use this BIG data more efficiently. But structuring alone is still not enough.

3

Multiple Choice

What is the main job of spider bots (also called web crawlers)?

1

To design websites

2

To create social media posts

3

To crawl the web and generate metadata for indexing

4

To block harmful websites

4

Multiple Choice

Why do websites get indexed by spider bots?

1

To make the websites easier to hack

2

To organize data for faster searches

3

To increase ad revenue

4

To delete old content

5

How can storage structures help?

For a text, a storage structure can record how many times each word appears, which can be used to compress the data. What would this new information accomplish? How is it useful? How can the text be put back into its original form?

  • Keeps original data safe
    You can change or delete metadata without affecting the original data.

  • Helps find and organize info
    Metadata makes it easy to sort or search for specific info, like common words in song lyrics.

  • Makes data more useful
    Metadata lets you quickly pull important info, like how often a word appears in many songs.

  • Adds structure to data
    Metadata helps you organize and compare data in new ways, like ranking songs by word length.
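
The word-count idea above can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's official method: the sample text and the variable names are made up. Note that word counts alone do not remember word order, so this sketch also stores a list of codes (one per word) so the text can be rebuilt.

```python
from collections import Counter

# Hypothetical sample text (words separated by single spaces).
text = "row row row your boat gently down the stream"
words = text.split()

# Metadata: how many times each word appears.
counts = Counter(words)

# Give the most frequent words the smallest codes.
vocabulary = [word for word, _ in counts.most_common()]
code_of = {word: code for code, word in enumerate(vocabulary)}

# The compressed form: one small number per word, in the original order.
encoded = [code_of[word] for word in words]

# Putting it back: look each code up in the vocabulary.
restored = " ".join(vocabulary[code] for code in encoded)

print(counts["row"])     # 3
print(restored == text)  # True
```

The original text is never changed: the counts and codes are extra information stored next to it.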

6

Multiple Choice

Why is structuring big data important?

1

It hides personal information

2

It makes data easier to delete

3

It helps us use and understand data more efficiently

4

It keeps hackers away

7

Multiple Choice

What is a challenge of continuing to collect Big Data indefinitely?

1

Data can become too large to manage effectively

2

Data can be deleted at any time

3

All data will be useful

4

Data collection will always be accurate

8

INDEX: One form of metadata is an index

An index is an alphabetical list of names and subjects (conceptual topics) with references to the places where they occur.


Why do search engines search through indexes of webpages instead of the webpages themselves when a search query is performed?

  • It is more efficient and less time-consuming. Although creating an index takes time, you create it once and then reuse it for different keywords.

  • Indexes help make it faster and easier to find specific information.

  • Instead of searching through all the data, an index points to where important parts are stored—like a table of contents in a book. This saves time and processing power, which is especially helpful when dealing with huge data sets.
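
The "create it once, reuse it for different keywords" idea can be sketched as follows. This is a simplified illustration, not how any real search engine works: the page names, page texts, and function names are all made up.

```python
from collections import defaultdict

# Hypothetical mini "web": page name -> page text.
pages = {
    "page1": "data storage and persistence",
    "page2": "spider bots crawl the web",
    "page3": "big data needs structure",
}

def build_index(pages):
    """Read every page once and record which pages contain each word."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, keyword):
    """Look the keyword up in the index instead of rereading every page."""
    return sorted(index.get(keyword.lower(), set()))

index = build_index(pages)    # slow step, done once

print(search(index, "data"))  # ['page1', 'page3']
print(search(index, "web"))   # ['page2']
```

Each search is a single lookup in the index, no matter how many pages there are.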

9


10

ORIGINAL DOCUMENT VS ITS METADATA

11

Another form of metadata is a concordance.

In a concordance, each form of a word, singular or plural, is treated as a separate word.


Indexes are like a table of contents — they help you quickly find where data is stored. They list key terms or fields and point to their location in the dataset.

Concordances are more detailed — they show every place a word or phrase appears, often with a bit of context. Think of it like a search with examples.

  • Indexes = Fast access to data locations (used for speed and efficiency)

  • Concordances = Deeper insight into how terms are used (used for analysis and meaning)

The Bible has concordances, and lawyers use concordances as well.
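
A concordance can be sketched in Python as a list of every place each word appears, along with a few words of context. This is a minimal illustration: the sample sentence, the function name, and the `window` parameter (how many neighboring words to keep) are made up.

```python
def build_concordance(text, window=2):
    """Map each word to every position where it appears, plus nearby words."""
    words = text.lower().split()
    concordance = {}
    for position, word in enumerate(words):
        start = max(0, position - window)
        context = " ".join(words[start:position + window + 1])
        concordance.setdefault(word, []).append((position, context))
    return concordance

# Hypothetical sample sentence.
text = "the song repeats and the songs repeat"
concordance = build_concordance(text)

# "song" and "songs" are kept as separate entries.
print(concordance["song"])   # [(1, 'the song repeats and')]
print(concordance["songs"])  # [(5, 'and the songs repeat')]

# A repeated word gets one entry per place it appears.
print(concordance["the"])    # [(0, 'the song repeats'), (4, 'repeats and the songs repeat')]
```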

12


13

Multiple Choice

Why do search engines search through indexes instead of actual webpages?

1

Indexes contain full versions of each webpage

2

Indexes are hidden from the public

3

Indexes are faster and more efficient for finding information

4

Indexes are easier to delete

14

Multiple Choice

What is the key difference between an index and a concordance?

1

Concordances show the exact location of every word with context; indexes list conceptual topics

2

Indexes are primarily for academic texts; concordances are for fiction

3

Concordances are only used in digital formats; indexes are for print

4

Indexes are always longer than concordances

5

There is no difference—they are the same

15

Multiple Choice

Which is true about concordances?

1

They skip repeated words

2

They combine all word forms into one entry

3

They are primarily for poetry analysis

4

They track every individual word and its location

5

They are only used for religious texts

16

  • PII (Personally Identifiable Information) includes:

    • Name, address, phone number

    • Credit card info, medical records, etc.

  • We often share PII online (e.g., shopping, signing up for accounts)

DEACTIVATING VS DELETING ACCOUNTS

17

Risks of storing Personally Identifiable Information:

  • Can be sold to third parties without your permission

  • May be used in harmful or unexpected ways

  • Social media dangers:

    • Posts about location or schedules can help criminals

    • Example: Sharing vacation plans could lead to break-ins

  • Technology tracks your data:

    • Search engines, websites, and apps track your activity

    • Info like your location, browsing history, and device data is collected

  • Be cautious:

    • Think before posting personal details

    • Your data is valuable—and others may try to misuse it

18

Multiple Choice

What does PII stand for?

1

Protected Internet Information

2

Private Internet Identity

3

Public Info Index

4

Personally Identifiable Information

19

Multiple Choice

Which of the following is an example of sharing PII online?

1

Watching a YouTube video

2

Liking a meme

3

Entering your address on an online store

4

Changing your wallpaper

20

Multiple Choice

What is a possible danger of posting your location on social media?

1

You could miss a new post

2

You might lose followers

3


Your post might not get enough likes

4

Criminals might use that info to target you

21

Vocabulary (Simplified)

  • Relational Database: A way to organize and access data using related tables.

  • Generation Loss: Quality lost each time analog data is copied; it does not happen with digital data when the copy is exact.

  • Browser: A program to access and view websites (e.g., Chrome, Firefox).

  • Metadata: Data that describes other data (like info about a photo or webpage).

  • Data vs. Information: Data = raw facts; Information = processed and meaningful data.

  • Data Persistence: Data that stays saved, even after deletion or long periods of no use.

  • Data Storage: Places where data is kept (like CDs, USBs, memory, or tapes).

  • Indexing: Organizing data to keep track of it efficiently.

  • Filter Bubble: When algorithms limit what info you see based on your past behavior.

  • Privacy Concerns: Digital data can be copied, shared, or sold more easily.

  • Utility: How useful something is — like trading data for a benefit.

  • Cache: Temporary storage for quick access to frequently used data.
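
The "related tables" idea behind a relational database can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the table names, column names, and sample rows are made up, and the database lives in memory, so nothing is saved to disk.

```python
import sqlite3

# In-memory database: it disappears when the connection closes.
conn = sqlite3.connect(":memory:")

# Two related tables: each song points to an artist by id.
conn.execute("CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE songs ("
    "id INTEGER PRIMARY KEY, title TEXT, "
    "artist_id INTEGER REFERENCES artists(id))"
)

# Hypothetical sample rows.
conn.executemany("INSERT INTO artists VALUES (?, ?)",
                 [(1, "Artist A"), (2, "Artist B")])
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)",
                 [(1, "Song One", 1), (2, "Song Two", 1), (3, "Song Three", 2)])

# Use the relationship to combine data from both tables.
rows = conn.execute(
    "SELECT songs.title, artists.name "
    "FROM songs JOIN artists ON songs.artist_id = artists.id "
    "ORDER BY songs.id"
).fetchall()
conn.close()

print(rows)
# [('Song One', 'Artist A'), ('Song Two', 'Artist A'), ('Song Three', 'Artist B')]
```

Each artist's name is stored only once, and the `artist_id` column links every song back to it.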
