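Training Session 2
This section combines exercises drawn from several classic introductions to data science. Work through them in order and think about how they relate to one another logically.
Data Types and Data Structures
Basic Data Types
Integers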
In [13]:
a = 10
type(a)
Out[13]:
int
In [15]:
a.bit_length()
Out[15]:
4
In [17]:
a = 100000
a.bit_length()
Out[17]:
17
In [19]:
googol = 10 ** 100
googol
Out[19]:
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
In [21]:
googol.bit_length()
Out[21]:
333
In [23]:
1 + 4
Out[23]:
5
In [25]:
1 / 4
Out[25]:
0.25
In [27]:
type(1 / 4)
Out[27]:
float
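The cells above can be tied together with a short sketch (an addition, not part of the original exercises): `bit_length()` agrees with the number of binary digits reported by `bin()`, and floor division `//` keeps an integer result where `/` always yields a `float`. The variable names are illustrative assumptions.

```python
a = 10

# bin() returns a string such as '0b1010'; the '0b' prefix is not a digit.
binary_digits = len(bin(a)) - 2
same_length = (binary_digits == a.bit_length())

# True division always returns a float; floor division of two ints returns an int.
true_div = 1 / 4
floor_div = 1 // 4
remainder = 1 % 4

print(same_length, type(true_div).__name__, type(floor_div).__name__, remainder)
```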
Floats
In [30]:
1.6 / 4
Out[30]:
0.4
In [32]:
type(1.6 / 4)
Out[32]:
float
In [34]:
b = 0.35
type(b)
Out[34]:
float
In [36]:
b + 0.1
Out[36]:
0.44999999999999996
In [38]:
c = 0.5
c.as_integer_ratio()
Out[38]:
(1, 2)
In [40]:
b.as_integer_ratio()
Out[40]:
(3152519739159347, 9007199254740992)
In [42]:
import decimal
from decimal import Decimal
In [44]:
decimal.getcontext()
Out[44]:
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
In [46]:
d = Decimal(1) / Decimal(11)
d
Out[46]:
Decimal('0.09090909090909090909090909091')
In [48]:
decimal.getcontext().prec = 4
In [50]:
e = Decimal(1) / Decimal(11)
e
Out[50]:
Decimal('0.09091')
In [52]:
decimal.getcontext().prec = 50
In [54]:
f = Decimal(1) / Decimal(11)
f
Out[54]:
Decimal('0.090909090909090909090909090909090909090909090909091')
In [56]:
g = d + e + f
g
Out[56]:
Decimal('0.27272818181818181818181818181909090909090909090909')
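Because `0.35 + 0.1` does not produce exactly `0.45`, equality tests on floats can surprise. The following is a minimal sketch (an addition, with illustrative variable names) of two common ways to cope: comparing with a tolerance via `math.isclose`, or building `Decimal` values from strings so the decimal digits are stored exactly.

```python
import math
from decimal import Decimal

b = 0.35
total = b + 0.1

# Direct comparison fails because of binary rounding error.
exactly_equal = (total == 0.45)

# Comparing with a relative tolerance is usually what is intended.
close_enough = math.isclose(total, 0.45)

# Decimals built from strings keep the decimal digits exactly.
decimal_total = Decimal('0.35') + Decimal('0.1')

print(exactly_equal, close_enough, decimal_total)
```

Constructing a `Decimal` from a float (for example `Decimal(0.35)`) would carry the binary rounding error along, which is why the sketch starts from strings.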
Booleans
In [59]:
import keyword
In [61]:
keyword.kwlist
Out[61]:
['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
In [63]:
4 > 3
Out[63]:
True
In [65]:
type(4 > 3)
Out[65]:
bool
In [67]:
type(False)
Out[67]:
bool
In [69]:
4 >= 3
Out[69]:
True
In [71]:
4 < 3
Out[71]:
False
In [73]:
4 <= 3
Out[73]:
False
In [75]:
4 == 3
Out[75]:
False
In [77]:
4 != 3
Out[77]:
True
In [79]:
True and True
Out[79]:
True
In [81]:
True and False
Out[81]:
False
In [83]:
False and False
Out[83]:
False
In [85]:
True or True
Out[85]:
True
In [87]:
True or False
Out[87]:
True
In [89]:
False or False
Out[89]:
False
In [91]:
not True
Out[91]:
False
In [93]:
not False
Out[93]:
True
In [95]:
(4 > 3) and (2 > 3)
Out[95]:
False
In [97]:
(4 == 3) or (2 != 3)
Out[97]:
True
In [99]:
not (4 != 4)
Out[99]:
True
In [101]:
(not (4 != 4)) and (2 == 3)
Out[101]:
False
In [103]:
if 4 > 3:
    print('condition true')
condition true
In [105]:
i = 0
while i < 4:
    print('condition true, i = ', i)
    i += 1
condition true, i =  0
condition true, i =  1
condition true, i =  2
condition true, i =  3
In [107]:
int(True)
Out[107]:
1
In [109]:
int(False)
Out[109]:
0
In [111]:
float(True)
Out[111]:
1.0
In [113]:
float(False)
Out[113]:
0.0
In [115]:
bool(0)
Out[115]:
False
In [117]:
bool(0.0)
Out[117]:
False
In [119]:
bool(1)
Out[119]:
True
In [121]:
bool(10.5)
Out[121]:
True
In [123]:
bool(-2)
Out[123]:
True
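Since `True` and `False` convert to `1` and `0`, Booleans can be summed to count how often a condition holds. The sketch below (an addition; the list `values` is made-up illustration data) also shows that empty containers are falsy while any non-empty string, even `'0'`, is truthy.

```python
values = [4, 7, 1, 9, 3]  # hypothetical sample data

# Each comparison yields a bool; summing them counts the True results.
count_above_three = sum(v > 3 for v in values)

# any() / all() combine many conditions with or / and semantics.
any_negative = any(v < 0 for v in values)
all_positive = all(v > 0 for v in values)

# Empty string and empty list are falsy; the non-empty string '0' is truthy.
empty_is_false = (bool('') is False) and (bool([]) is False)
text_zero_is_true = bool('0')

print(count_above_three, any_negative, all_positive, empty_is_false, text_zero_is_true)
```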
Strings
In [126]:
t = 'this is a string object'
In [128]:
t.capitalize()
Out[128]:
'This is a string object'
In [130]:
t.split()
Out[130]:
['this', 'is', 'a', 'string', 'object']
In [132]:
t.find('string')
Out[132]:
10
In [134]:
t.find('Python')
Out[134]:
-1
In [136]:
t.replace(' ', '|')
Out[136]:
'this|is|a|string|object'
In [138]:
'http://www.python.org'.strip('htp:/')
Out[138]:
'www.python.org'
In [140]:
a = "我是张三"
print(a)
我是张三
In [142]:
张三 = 100
李四 = 300
print(张三 + 李四)
400
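One detail of the `strip` cell above deserves a note: `strip('htp:/')` removes any leading or trailing characters from that set, not the literal prefix `http://`. A small sketch (an addition; the URL `http://python.org` is chosen only for illustration, and `str.removeprefix` assumes Python 3.9 or later):

```python
t = 'this is a string object'

# split() followed by join() gives the same result as replace() here,
# because the words are separated by single spaces.
joined = '|'.join(t.split())
same_as_replace = (joined == t.replace(' ', '|'))

# strip() treats its argument as a set of characters, so the leading 'p'
# of 'python' is removed as well.
stripped = 'http://python.org'.strip('htp:/')

# removeprefix() (Python 3.9+) removes the exact prefix only.
prefix_removed = 'http://python.org'.removeprefix('http://')

print(same_as_replace, stripped, prefix_removed)
```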
Loops: Printing and String Replacement
In [145]:
print('Python for Finance')
Python for Finance
In [147]:
print(t)
this is a string object
In [149]:
i = 0
while i < 4:
    print(i)
    i += 1
0
1
2
3
In [151]:
i = 0
while i < 4:
    print(i, end='|')
    i += 1
0|1|2|3|
In [153]:
'this is an integer %d' % 15
Out[153]:
'this is an integer 15'
In [155]:
'this is an integer %4d' % 15
Out[155]:
'this is an integer   15'
In [157]:
'this is an integer %04d' % 15
Out[157]:
'this is an integer 0015'
In [159]:
'this is a float %f' % 15.3456
Out[159]:
'this is a float 15.345600'
In [161]:
'this is a float %.2f' % 15.3456
Out[161]:
'this is a float 15.35'
In [163]:
'this is a float %8f' % 15.3456
Out[163]:
'this is a float 15.345600'
In [165]:
'this is a float %8.2f' % 15.3456
Out[165]:
'this is a float    15.35'
In [167]:
'this is a float %08.2f' % 15.3456
Out[167]:
'this is a float 00015.35'
In [169]:
'this is a string %s' % 'Python'
Out[169]:
'this is a string Python'
In [171]:
'this is a string %10s' % 'Python'
Out[171]:
'this is a string     Python'
In [173]:
'this is an integer {:d}'.format(15)
Out[173]:
'this is an integer 15'
In [175]:
'this is an integer {:4d}'.format(15)
Out[175]:
'this is an integer   15'
In [177]:
'this is an integer {:04d}'.format(15)
Out[177]:
'this is an integer 0015'
In [179]:
'this is a float {:f}'.format(15.3456)
Out[179]:
'this is a float 15.345600'
In [181]:
'this is a float {:.2f}'.format(15.3456)
Out[181]:
'this is a float 15.35'
In [183]:
'this is a float {:8f}'.format(15.3456)
Out[183]:
'this is a float 15.345600'
In [185]:
'this is a float {:8.2f}'.format(15.3456)
Out[185]:
'this is a float    15.35'
In [187]:
'this is a float {:08.2f}'.format(15.3456)
Out[187]:
'this is a float 00015.35'
In [189]:
'this is a string {:s}'.format('Python')
Out[189]:
'this is a string Python'
In [191]:
'this is a string {:10s}'.format('Python')
Out[191]:
'this is a string Python    '
In [193]:
i = 0
while i < 4:
    print('the number is %d' % i)
    i += 1
the number is 0
the number is 1
the number is 2
the number is 3
In [195]:
i = 0
while i < 4:
    print('the number is {:d}'.format(i))
    i += 1
the number is 0
the number is 1
the number is 2
the number is 3
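The same format specifications also work inside f-strings, a third formatting style that the cells above do not cover. The following sketch is an addition with illustrative variable names; the trailing `|` is there only to make the padding visible.

```python
number = 15
value = 15.3456
name = 'Python'

# The part after the colon is the same mini-language used by str.format().
padded_int = f'this is an integer {number:04d}'
rounded = f'this is a float {value:8.2f}'

# Strings are left-aligned by default; '>' requests right alignment.
left_aligned = f'this is a string {name:10s}|'
right_aligned = f'this is a string {name:>10s}|'

for i in range(4):
    print(f'the number is {i:d}')
```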
Regular Expressions (Optional)
In [198]:
import re
In [200]:
series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""
In [202]:
dt = re.compile(r"'[0-9/:\s]+'")  # datetime; raw string so that \s reaches the regex engine unchanged
In [206]:
result = dt.findall(series)
result
Out[206]:
["'01/18/2014 13:00:00'", "'01/18/2014 13:30:00'", "'01/18/2014 14:00:00'"]
In [208]:
from datetime import datetime
pydt = datetime.strptime(result[0].replace("'", ""),
                         '%m/%d/%Y %H:%M:%S')
pydt
Out[208]:
datetime.datetime(2014, 1, 18, 13, 0)
In [210]:
print(pydt)
2014-01-18 13:00:00
In [212]:
print(type(pydt))
<class 'datetime.datetime'>
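The cells above convert only the first match. A possible extension (a sketch, not part of the original exercise) uses a capture group so that `findall` returns the text between the quotes directly, then parses every timestamp in one list comprehension. The names `pattern` and `timestamps` are illustrative.

```python
import re
from datetime import datetime

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

# Raw string avoids the invalid-escape warning; the parentheses capture
# the text between the quotes, so no replace() call is needed afterwards.
pattern = re.compile(r"'([0-9/:\s]+)'")

timestamps = [datetime.strptime(text, '%m/%d/%Y %H:%M:%S')
              for text in pattern.findall(series)]

for ts in timestamps:
    print(ts)
```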
Basic Data Structures
Tuples
In [216]:
t = (1, 2.5, 'data')
type(t)
Out[216]:
tuple
In [218]:
t = 1, 2.5, 'data'
type(t)
Out[218]:
tuple
In [220]:
t[2]
Out[220]:
'data'
In [222]:
type(t[2])
Out[222]:
str
In [224]:
t.count('data')
Out[224]:
1
In [226]:
t.index(1)
Out[226]:
0
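Two tuple properties that the cells above do not show are unpacking and immutability. A minimal sketch (an addition, with illustrative names):

```python
t = (1, 2.5, 'data')

# Unpacking assigns each element to its own name in one statement.
number, fraction, label = t

# Tuples are immutable: item assignment raises a TypeError.
try:
    t[0] = 99
    message = 'no error'
except TypeError as err:
    message = str(err)

print(number, fraction, label)
print(message)
```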
Lists
In [229]:
l = [1, 2.5, 'data']
l[2]
Out[229]:
'data'
In [231]:
l = list(t)
l
Out[231]:
[1, 2.5, 'data']
In [233]:
type(l)
Out[233]:
list
In [235]:
l.append([4, 3])
l
Out[235]:
[1, 2.5, 'data', [4, 3]]
In [237]:
l.extend([1.0, 1.5, 2.0])
l
Out[237]:
[1, 2.5, 'data', [4, 3], 1.0, 1.5, 2.0]
In [239]:
l.insert(1, 'insert')
l
Out[239]:
[1, 'insert', 2.5, 'data', [4, 3], 1.0, 1.5, 2.0]
In [241]:
l.remove('data')
l
Out[241]:
[1, 'insert', 2.5, [4, 3], 1.0, 1.5, 2.0]
In [243]:
p = l.pop(3)
print(l, p)
[1, 'insert', 2.5, 1.0, 1.5, 2.0] [4, 3]
In [245]:
l[2:5]
Out[245]:
[2.5, 1.0, 1.5]
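Unlike tuples, lists are mutable, so it matters whether two names refer to the same list or to separate copies. The sketch below (an addition; it starts from the list state reached above, and the names `alias` and `shallow_copy` are illustrative) contrasts plain assignment with a slice copy and shows two more slicing forms.

```python
l = [1, 'insert', 2.5, 1.0, 1.5, 2.0]

alias = l             # second name for the same list object
shallow_copy = l[:]   # new list with the same elements

alias.append('new')   # visible through l, but not through shallow_copy

every_second = l[::2]  # step of 2, starting at index 0
last_two = l[-2:]      # negative indices count from the end

print(l)
print(shallow_copy)
print(every_second, last_two)
```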
Control Structures
In [248]:
for element in l[2:5]:
    print(element ** 2)
6.25
1.0
2.25
In [250]:
r = range(0, 8, 1)
r
Out[250]:
range(0, 8)
In [252]:
type(r)
Out[252]:
range
In [254]:
for i in range(2, 5):
    print(l[i] ** 2)
6.25
1.0
2.25
In [256]:
for i in range(1, 10):
    if i % 2 == 0:
        print("%d is even" % i)
    elif i % 3 == 0:
        print("%d is multiple of 3" % i)
    else:
        print("%d is odd" % i)
1 is odd
2 is even
3 is multiple of 3
4 is even
5 is odd
6 is even
7 is odd
8 is even
9 is multiple of 3
In [258]:
total = 0
while total < 100:
    total += 1
print(total)
100
In [260]:
m = [i ** 2 for i in range(5)]
m
Out[260]:
[0, 1, 4, 9, 16]
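A few related constructs round off the loops above: a list comprehension with a filter condition, `zip` and `enumerate` for iterating over paired data, and `break` for leaving a loop early. This is a sketch added for illustration; `labels` and `prices` are made-up sample data.

```python
# Comprehension with a condition: squares of the even numbers below 10.
even_squares = [i ** 2 for i in range(10) if i % 2 == 0]

labels = ['1st', '2nd', '3rd']   # hypothetical sample data
prices = [100, 110, 120]         # hypothetical sample data

# zip() pairs the elements; enumerate() adds a running position.
paired = list(zip(labels, prices))
for position, (label, price) in enumerate(paired, start=1):
    print(position, label, price)

# break leaves the loop as soon as the first match is found.
first_over_50 = None
for n in even_squares:
    if n > 50:
        first_over_50 = n
        break

print(even_squares, first_over_50)
```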
Writing Functions
In [263]:
def f(x):
    return x ** 2
f(2)
Out[263]:
4
In [265]:
def even(x):
    return x % 2 == 0
even(3)
Out[265]:
False
In [267]:
list(map(even, range(10)))
Out[267]:
[True, False, True, False, True, False, True, False, True, False]
In [269]:
list(map(lambda x: x ** 2, range(10)))
Out[269]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [271]:
list(filter(even, range(15)))
Out[271]:
[0, 2, 4, 6, 8, 10, 12, 14]
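The `map` and `filter` calls above can also be written as list comprehensions, which many Python programmers find easier to read. The sketch below is an addition; the function `power` with its default argument is a hypothetical helper introduced only for illustration.

```python
def even(x):
    """Return True if x is divisible by 2."""
    return x % 2 == 0


def power(x, exponent=2):
    """Raise x to the given exponent (defaults to squaring)."""
    return x ** exponent


# Comprehension equivalents of the map() and filter() calls.
squares = [power(x) for x in range(10)]
evens = [x for x in range(15) if even(x)]

# The default can be overridden with a keyword argument.
cubes = [power(x, exponent=3) for x in range(4)]

print(squares)
print(evens)
print(cubes)
```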
Dicts
In [274]:
d = {
    'Name': 'Angela Merkel',
    'Country': 'Germany',
    'Profession': 'Chancellor',
    'Age': 64
}
type(d)
Out[274]:
dict
In [276]:
print(d['Name'], d['Age'])
Angela Merkel 64
In [278]:
d.keys()
Out[278]:
dict_keys(['Name', 'Country', 'Profession', 'Age'])
In [280]:
d.values()
Out[280]:
dict_values(['Angela Merkel', 'Germany', 'Chancellor', 64])
In [282]:
d.items()
Out[282]:
dict_items([('Name', 'Angela Merkel'), ('Country', 'Germany'), ('Profession', 'Chancellor'), ('Age', 64)])
In [284]:
birthday = True
if birthday:
    d['Age'] += 1
print(d['Age'])
65
In [286]:
for item in d.items():
    print(item)
('Name', 'Angela Merkel')
('Country', 'Germany')
('Profession', 'Chancellor')
('Age', 65)
In [288]:
for value in d.values():
    print(type(value))
<class 'str'>
<class 'str'>
<class 'str'>
<class 'int'>
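Three further dictionary idioms are worth knowing alongside the cells above: `get` with a default for keys that may be missing, membership tests with `in`, and dict comprehensions. This sketch is an addition; it uses a shortened copy of the dictionary, and the key `'City'` is a hypothetical missing key.

```python
d = {'Name': 'Angela Merkel', 'Country': 'Germany', 'Age': 64}

# d['City'] would raise a KeyError; get() returns a default instead.
city = d.get('City', 'unknown')
has_age = 'Age' in d

# Dict comprehension: keep only the entries whose value is a string.
text_fields = {key: value for key, value in d.items()
               if isinstance(value, str)}

# items() yields (key, value) tuples that can be unpacked in the loop header.
for key, value in d.items():
    print(f'{key}: {value}')
```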
Sets
In [291]:
s = set(['u', 'd', 'ud', 'du', 'd', 'du'])
s
Out[291]:
{'d', 'du', 'u', 'ud'}
In [293]:
t = set(['d', 'dd', 'uu', 'u'])
In [295]:
s.union(t)
Out[295]:
{'d', 'dd', 'du', 'u', 'ud', 'uu'}
In [297]:
s.intersection(t)
Out[297]:
{'d', 'u'}
In [299]:
s.difference(t)
Out[299]:
{'du', 'ud'}
In [301]:
t.difference(s)
Out[301]:
{'dd', 'uu'}
In [303]:
s.symmetric_difference(t)
Out[303]:
{'dd', 'du', 'ud', 'uu'}
In [305]:
from random import randint
l = [randint(0, 10) for i in range(1000)]
len(l)
Out[305]:
1000
In [307]:
l[:20]
Out[307]:
[1, 6, 4, 6, 3, 2, 8, 0, 0, 0, 2, 4, 7, 2, 9, 7, 3, 0, 8, 3]
In [309]:
s = set(l)
s
Out[309]:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
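The set methods above each have an operator form, and where the frequencies matter rather than just the distinct values, `collections.Counter` is a natural companion to `set`. The sketch below is an addition; the seed value 42 is an arbitrary assumption made only so that repeated runs draw the same numbers.

```python
import random
from collections import Counter

s = {'u', 'd', 'ud', 'du'}
t = {'d', 'dd', 'uu', 'u'}

# Operator forms of union, intersection, difference, symmetric difference.
operators_match = (
    (s | t) == s.union(t)
    and (s & t) == s.intersection(t)
    and (s - t) == s.difference(t)
    and (s ^ t) == s.symmetric_difference(t)
)

random.seed(42)  # arbitrary seed (assumption) for reproducible draws
draws = [random.randint(0, 10) for _ in range(1000)]

# A set keeps only the distinct values; a Counter also keeps how often each occurred.
distinct = set(draws)
counts = Counter(draws)

print(operators_match, len(distinct), counts.most_common(3))
```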
End of this training session.