Quiz: AWS Storage: 20+ Questions!
Can you navigate the cloud labyrinth?
Are you down to cloud?! 🤡
Dive deep into AWS Storage Services! This quiz will test your knowledge of S3, DynamoDB, Aurora, RDS, ElastiCache, and more. From best practices to tricky gotchas, we’ll explore the cloud storage landscape.
Get ready to prove your cloud expertise! 🚀
What does the name S3 mean?
S3 stands for Simple Storage Service. It’s a scalable object storage service designed for large-scale data storage.
AWS S3 offers multiple storage classes:
- Standard: For frequently accessed data
- Infrequent Access (IA): Lower cost for less frequent access
- Glacier: Long-term, low-cost archival storage
Each class offers different pricing and access characteristics, allowing cost optimization based on data usage patterns.
What does it mean when DynamoDB is described as “schema-less”?
DynamoDB is considered “schema-less” because it allows you to store arbitrary properties in items without a predefined schema.
Most scalable way to handle many UPDATES to DynamoDB? (for example, backfilling a status=active column.)
The key here is updates, not inserts or PUTs. If you’re doing inserts, you can use BatchWriteItem or TransactWriteItems.
While BatchWriteItem can handle multiple operations, it’s limited to PUTs and DELETEs. TransactWriteItems is more powerful, but it’s a bit of a sledgehammer for simple updates.
For simple updates, UpdateItem is the best choice. It modifies one or more attributes of a single existing item per request, so a large backfill means issuing many UpdateItem calls, typically in parallel.
The UpdateItem operation:
- Updates an existing item’s attributes.
- Adds new attributes to an existing item.
- Removes attributes from an existing item.
- Conditionally performs the update if the item exists or meets certain conditions.
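As a rough sketch of the backfill case, the parameters for one UpdateItem call might look like this. The table name `users`, the key attribute `pk`, and the helper `build_status_backfill` are all hypothetical. The dictionary follows the low-level UpdateItem request shape and would be handed to an SDK client, which is assumed and not shown.

```python
def build_status_backfill(table_name: str, pk_value: str) -> dict:
    """Build UpdateItem parameters that set status=active on one existing item."""
    return {
        "TableName": table_name,
        # "pk" is a hypothetical partition key name; use your table's real key.
        "Key": {"pk": {"S": pk_value}},
        # "status" is a DynamoDB reserved word, so it is aliased as #s.
        "UpdateExpression": "SET #s = :active",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {":active": {"S": "active"}},
        # Without this condition, UpdateItem would create the item if it is missing.
        "ConditionExpression": "attribute_exists(pk)",
    }


params = build_status_backfill("users", "user#123")
```

The condition expression is a design choice: drop it if you actually want upsert behavior.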
Which AWS database service does NOT support full-text search capabilities out of the box?
While many AWS database services offer rich query capabilities:
- OpenSearch: Built specifically for full-text search
- Neptune: Graph queries + full-text search capabilities
- Redshift: SQL with full-text search functions
- DocumentDB: Text search through MongoDB compatibility
ElastiCache (Redis/Memcached) is primarily for caching and doesn’t include built-in full-text search capabilities. (As of late 2024.)
Redis has recently added full-text search modules under their Redis Source Available License, or RSAL. This license prevents AWS from freely copying the Redis folks’ work into the ElastiCache product.
What is the primary benefit of RDS Multi-AZ deployment?
Availability Zones (AZs) are isolated locations within a region, each made up of one or more data centers. RDS Multi-AZ deployment provides automatic failover to a standby replica in a different AZ.
Multi-AZ deployment:
- Provides automatic failover
- Increases database availability
- Creates a synchronous standby replica
- Minimizes downtime during infrastructure failures
Don’t confuse Multi-AZ deployment with Read Replicas, which are used for scaling read operations.
👋 I hope you’re having fun so far!
Time for a tricky one…
Which AWS service does NOT support stateful, direct, full-duplex WebSocket connections?
API Gateway does support WebSockets, but the implementation is a bit different. Instead of full-duplex direct connections, it takes a more “stateless” approach: API Gateway holds the client connection, so your backend never maintains a direct connection to the client. Incoming events arrive at API Gateway, which dispatches each message to the appropriate Lambda function or other service. Messages are relayed back to the client in a similar manner.
The others are much more WebSocket-friendly:
- Lightsail: Perfect for simple WebSocket setups 👌
- AppSync: Uses WebSockets for real-time data sync
- EC2: Your classic “do whatever you want” option for WebSockets
- EKS: Great for running scalable WebSocket clusters
Pro tip: If you need raw WebSocket power, stick with the compute services!
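To make the relayed model concrete, here is a minimal sketch of a Lambda handler behind an API Gateway WebSocket API. The `sendMessage` route name and the in-memory `connections` set are assumptions for illustration only. Each invocation is independent, so a real deployment would keep connection IDs in a durable store.

```python
# Hypothetical stand-in for a durable connection store (for example, a table).
connections = set()


def websocket_handler(event, context=None):
    """Route one WebSocket event that API Gateway forwarded to Lambda."""
    request_context = event["requestContext"]
    route = request_context["routeKey"]
    connection_id = request_context["connectionId"]

    if route == "$connect":
        connections.add(connection_id)
    elif route == "$disconnect":
        connections.discard(connection_id)
    elif route == "sendMessage":  # hypothetical custom route
        # The function holds no socket. A reply would go back through API
        # Gateway's connection management API, which is not called here.
        pass

    return {"statusCode": 200}
```

Notice that the function only ever sees a connection ID, never a socket. That is the “stateless” part.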
What’s the recommended approach to S3 bucket permissions?
In virtually ALL systems, embracing a “least privilege” design is a key way to harden and future-proof them. Trying to lock down an existing system after the fact is about as difficult as moving an entire office building to a new foundation.
S3 buckets are no exception. To apply the principle of least privilege, start with no permissions and grant only the necessary access. Use IAM roles and policies to control access and regularly audit bucket permissions.
Security best practices:
- Apply least privilege principle
- Start with no permissions
- Grant only necessary access
- Use IAM roles and policies
- Regularly audit bucket permissions
Avoid overly permissive settings that could expose sensitive data.
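As an illustration of “grant only necessary access”, the sketch below builds an IAM policy document that allows reading objects under one prefix and nothing else. The bucket name `example-reports-bucket`, the prefix `reports/`, and the helper name are hypothetical.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy allowing s3:GetObject under a single key prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # One action only: no listing, writing, or deleting.
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            }
        ],
    }


policy_json = json.dumps(
    read_only_prefix_policy("example-reports-bucket", "reports/"), indent=2
)
```

Starting from a document this narrow and widening it only when something breaks is usually easier than auditing a wildcard policy later.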
What is the key feature of Aurora Serverless?
Aurora Serverless:
- Automatically scales compute capacity
- Adjusts resources based on workload
- Ideal for unpredictable workloads
- Pay only for used resources
Great for applications with variable traffic patterns.
One more DynamoDB batch question!
What’s the maximum number of items you can retrieve using a single DynamoDB BatchGetItem request?
The DynamoDB API allows you to retrieve up to 100 items in a single BatchGetItem request. This is higher than the limit for BatchWriteItem, which is 25 items.
Additionally, there are limits on the total payload size (16 MB per BatchGetItem response), item size (400 KB), and request rate.
Understanding these limits is crucial for optimizing your application’s performance and ensuring efficient data operations.
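One practical consequence of these limits is that larger key sets have to be split client-side. The sketch below chunks a list of keys into request-sized batches. The key attribute `pk` and the helper `chunked` are hypothetical, and the actual SDK calls are left out.

```python
from typing import Iterator, List

BATCH_GET_LIMIT = 100   # max keys per BatchGetItem request
BATCH_WRITE_LIMIT = 25  # max put/delete requests per BatchWriteItem request


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 hypothetical keys become three read batches.
keys = [{"pk": {"S": f"user#{i}"}} for i in range(250)]
read_batches = list(chunked(keys, BATCH_GET_LIMIT))
write_batches = list(chunked(keys, BATCH_WRITE_LIMIT))
```

A batch response can also come back partially fulfilled, with the remainder reported as unprocessed keys, so real code would retry those as well.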
Note: It is possible to exceed some of these limits - if you can sweet-talk your AWS account manager. 😎
What’s the maximum number of items DynamoDB can UPDATE per batch?
The DynamoDB clients are essentially all wrappers for its HTTP API. The BatchWriteItem operation can PUT or DELETE up to 25 items per HTTP request, but it cannot UPDATE multiple items.
While DynamoDB can INSERT up to 25 items per HTTP request, it can UPDATE only 1 item per request using the UpdateItem operation.
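Since each update is its own request, scaling a backfill mostly means running many single-item updates concurrently. Here is a minimal sketch using a thread pool. The `update_one` callable is a hypothetical stand-in for whatever function issues one UpdateItem call, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def backfill(keys: Iterable[str], update_one: Callable[[str], None],
             workers: int = 8) -> int:
    """Call update_one once per key in parallel; return how many completed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the results, so an exception in any worker is raised here.
        results = list(pool.map(update_one, keys))
    return len(results)


# Stand-in for a real UpdateItem call: just record which keys were "updated".
updated: List[str] = []
all_keys = [f"user#{i}" for i in range(10)]
count = backfill(all_keys, updated.append)
```

In practice you would also cap concurrency to stay within the table’s write capacity and retry throttled requests.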
When should you use DynamoDB On-Demand capacity?
On-Demand Capacity is best for:
- Unpredictable workloads
- Sporadic traffic
- Applications with unknown access patterns
- Avoiding over-provisioning
Provisioned capacity is better for:
- Predictable, consistent workloads
- More control over performance
- Potential cost savings
How to optimize S3 performance for high request rates?
S3 Performance Tips:
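- Spread objects across multiple key prefixes, for example with a short hash prefix
- Prevents “hot” partitions
- Distributes load across S3 infrastructure
- Improves request distribution
Avoid funnelling all traffic through a single sequential prefix, which can create a bottleneck. Since 2018, S3 has supported at least 3,500 write and 5,500 read requests per second per prefix, so randomized prefixes matter mainly at very high request rates.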
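One way to derive a hashed prefix is sketched below. The helper name, the four-character prefix length, and the choice of SHA-256 are all assumptions; any stable hash would do.

```python
import hashlib


def prefixed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short, deterministic hash prefix to an object key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# Sequential-looking names end up under unrelated prefixes.
example = prefixed_key("2024-11-30/report-0001.csv")
```

The prefix is derived from the key itself, so readers can recompute it without a lookup table. The trade-off is that listing objects in date order gets harder.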
What’s the recommended RDS backup approach?
Best Backup Practices:
- Enable automated backups
- Use point-in-time recovery
- Retain backups based on compliance needs
- Test restoration process regularly
- Consider cross-region backup
Automated backups provide:
- Continuous data protection
- Flexible recovery options
Key difference between Redis and Memcached in ElastiCache?
Redis Advantages:
- Supports complex data structures
- Persistence options
- Advanced operations
- Pub/Sub messaging
Memcached:
- Simple key-value store
- Pure caching
- High performance for simple use cases
Purpose of Global Secondary Index in DynamoDB?
Global Secondary Index (GSI):
- Allows querying on non-primary key attributes
- Creates alternative access patterns
- Increases query flexibility
- Comes with additional write capacity cost
Useful for complex query requirements beyond primary key.
What does S3 Lifecycle Management enable?
Lifecycle Management:
- Automatically transition objects between storage classes
- Move infrequent data to cheaper storage
- Set rules for object expiration
- Optimize storage costs
- Reduce manual management overhead
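A lifecycle rule is just a small configuration document. The sketch below shows one as a Python dictionary in the shape S3 expects for a bucket lifecycle configuration. The rule ID, the `logs/` prefix, and the day counts are illustrative assumptions, not recommendations.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},      # only objects under logs/
            "Transitions": [
                # Move to cheaper storage as the data cools off.
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Delete the objects entirely after a year.
            "Expiration": {"Days": 365},
        }
    ]
}

rule = lifecycle_configuration["Rules"][0]
transition_days = [t["Days"] for t in rule["Transitions"]]
```

Once a rule like this is attached to a bucket, S3 applies it on its own schedule, with no cron jobs required.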
What’s the maximum number of read replicas Amazon Aurora supports?
Amazon Aurora supports up to 15 read replicas, allowing you to significantly scale your read operations. These replicas benefit from:
- Near-instantaneous replication across replicas
- Minimal performance impact on the primary instance
- Efficient distribution of read workloads
This setup enables horizontal scaling for applications with heavy read demands.
What encryption capabilities does RDS provide?
RDS Encryption Features:
- Encrypt data at rest using KMS
- Encrypt data in transit using SSL/TLS
- Enable encryption during database creation
- Protect sensitive information
- Compliance with security standards
What is the primary use of DynamoDB Streams?
DynamoDB Streams:
- Capture item-level changes
- Enable event-driven architectures
- Trigger Lambda functions
- Support cross-region replication
- Provide near real-time data movement
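For a feel of what “item-level changes” look like to a consumer, here is a minimal sketch of a Lambda handler for a stream batch. It only reads the `eventName` and `dynamodb.Keys` fields of each record. The handler name and the sample key attribute `pk` are hypothetical.

```python
def stream_handler(event, context=None):
    """Summarize a batch of DynamoDB Streams records by change type."""
    changes = []
    for record in event.get("Records", []):
        event_name = record["eventName"]   # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]  # primary key of the changed item
        changes.append((event_name, keys))
    return changes


sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"pk": {"S": "user#1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"pk": {"S": "user#2"}}}},
    ]
}
summary = stream_handler(sample_event)
```

Whether a record also carries the old or new item image depends on how the stream is configured, so this sketch sticks to the keys.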
Best method for uploading large files to S3?
Multipart Upload Benefits:
- Handle large files efficiently
- Resume interrupted uploads
- Parallel upload of file parts
- Recommended for files > 100MB
- Improved network reliability
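To show what splitting a file into parts involves, the sketch below plans the byte ranges for a multipart upload. The 64 MiB default part size and the helper name are assumptions. The 5 MiB minimum part size and 10,000-part maximum are S3’s documented limits.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # every part except the last must be at least this big
MAX_PARTS = 10_000       # S3 allows at most this many parts per upload


def plan_parts(file_size: int, part_size: int = 64 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering the whole file."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum")
    parts = []
    offset = 0
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


parts = plan_parts(200 * MIB)
```

Each tuple can be uploaded independently and retried on its own, which is where the resume and parallelism benefits come from.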
What’s the most cost-effective approach for storing 1PB of data with 20% accessed daily, 30% monthly, and 50% yearly?
Optimal Storage Strategy:
- 20% in S3 Standard for daily access
- 30% in S3 Standard-IA for monthly access
- 50% in Glacier for yearly access
This approach optimizes costs while maintaining appropriate access patterns.
Cost Considerations:
- Storage pricing per GB
- Retrieval costs
- Access patterns
- Transition costs
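The comparison can be sketched as simple arithmetic. The prices below are made-up relative units, NOT real AWS prices, chosen only so the tiers are ordered from most to least expensive. The sketch covers storage only and ignores retrieval and transition charges.

```python
PB_IN_GB = 1_000_000  # decimal units: 1 PB = 1,000,000 GB


def monthly_storage_cost(total_gb: float, shares: dict, price_per_gb: dict) -> float:
    """Sum storage cost across tiers; `shares` maps tier name to a fraction."""
    return sum(total_gb * shares[tier] * price_per_gb[tier] for tier in shares)


shares = {"standard": 0.20, "standard_ia": 0.30, "glacier": 0.50}

# Placeholder relative prices (Standard = 1.0). Substitute current regional pricing.
placeholder_prices = {"standard": 1.0, "standard_ia": 0.5, "glacier": 0.2}

tiered = monthly_storage_cost(PB_IN_GB, shares, placeholder_prices)
all_standard = monthly_storage_cost(PB_IN_GB, {"standard": 1.0}, placeholder_prices)
```

With any pricing where the colder tiers are cheaper per GB, the tiered total comes out below the all-Standard total. How far below depends on retrieval costs, which this sketch leaves out.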
A DynamoDB table has a provisioned read capacity of 100 RCUs. How many strongly consistent reads of 4KB items can be performed per second?
Understanding DynamoDB consistency models is crucial:
- 1 RCU = 1 strongly consistent read/second for items up to 4KB
- 1 RCU = 2 eventually consistent reads/second for items up to 4KB
Therefore:
- 100 RCUs = 100 strongly consistent 4KB reads/second
- 100 RCUs = 200 eventually consistent 4KB reads/second
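The same arithmetic extends to larger items, since every read is billed in 4 KB steps rounded up. A minimal sketch follows; the function name is an assumption.

```python
import math


def reads_per_second(rcus: int, item_size_kb: float,
                     strongly_consistent: bool = True) -> int:
    """Estimate how many reads per second a provisioned RCU budget supports."""
    # Each read costs one unit per 4 KB of item size, rounded up.
    units_per_read = max(1, math.ceil(item_size_kb / 4))
    if strongly_consistent:
        return rcus // units_per_read
    # Eventually consistent reads cost half as much, so the budget goes twice as far.
    return (rcus * 2) // units_per_read


strong_4kb = reads_per_second(100, 4)                               # 100
eventual_4kb = reads_per_second(100, 4, strongly_consistent=False)  # 200
strong_8kb = reads_per_second(100, 8)                               # 50
```

Note that a 5 KB item costs the same as an 8 KB one, because both round up to two 4 KB units.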
Choose consistency models based on:
- Application requirements
- Cost considerations
- Performance needs
- Data freshness requirements
In an Aurora cluster with multiple read replicas, what happens during an automatic failover when the primary instance fails?
Aurora Failover Process:
- Detects primary instance failure
- Selects an existing replica to promote, using each replica’s failover priority tier (ties go to the largest instance)
- Promotes that replica to be the new primary
- Updates cluster endpoint automatically
Best Practices:
- Maintain multiple replicas across AZs
- Monitor replication lag
- Use cluster endpoint in applications
- Test failover scenarios regularly
As of late 2020, what consistency model does S3 provide for all operations?
S3 Consistency Model:
- Strong read-after-write consistency for all operations
- Applies to PUTs and DELETEs
- No need for workarounds previously used
- No additional cost
Impact:
- Simplified application logic
- No need for consistency checks
- Reliable immediate reads after writes
- Improved application reliability
How does DynamoDB’s TTL feature handle item deletion?
DynamoDB TTL Characteristics:
- Background process monitors TTL attribute
- Items deleted within 48 hours of expiration
- No additional cost for TTL
- Deleted items appear in streams
Use Cases:
- Session management
- Log expiration
- Temporary data cleanup
- Regulatory compliance
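The TTL attribute itself is just a Number holding an expiry time in Unix epoch seconds. The sketch below builds a session item with a 24-hour lifetime. The attribute names `pk` and `expires_at` and the helper name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch(now: datetime, lifetime: timedelta) -> int:
    """Return the expiry time as whole seconds since the Unix epoch."""
    return int((now + lifetime).timestamp())


now = datetime(2024, 12, 1, tzinfo=timezone.utc)
session_item = {
    "pk": {"S": "session#abc"},
    # DynamoDB number values are sent as strings in the low-level format.
    "expires_at": {"N": str(ttl_epoch(now, timedelta(hours=24)))},
}
```

Deletion happens some time after expiry rather than at that exact second, so queries that must never see stale items should also filter on the attribute.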
What’s the key consideration when relying on Aurora Serverless for handling sudden traffic spikes?
Aurora Serverless Scaling:
- Requires 15-60 seconds for scaling
- Creates new capacity optimized for workload
- May pause during very low activity
- Billing per-second based on ACUs
Best Practices:
- Set minimum capacity for critical workloads
- Monitor scaling events
- Configure timeout actions
- Use proper connection management
Wow, that adventure got deep in the weeds! 🚀☁️ I hope you enjoyed the journey, and maybe even learned a thing or two about AWS Storage Services.
Check out more of Dan’s challenges! 🧠
Legal: This quiz is for educational purposes only. All trademarks & copyrights are property of their respective owners, especially the big guys.